RubyPants

RubyPants -- SmartyPants ported to Ruby

Synopsis

RubyPants is a Ruby port of the smart-quotes library SmartyPants.

The original "SmartyPants" is a free web publishing plug-in for Movable Type, Blosxom, and BBEdit that easily translates plain ASCII punctuation characters into "smart" typographic punctuation HTML entities.

Description

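RubyPants can perform the following transformations:

* Straight quotes (" and ') into "curly" quote HTML entities
* Backtick-style quotes (``like this'') into "curly" quote HTML entities
* Dashes ("--" and "---") into en- and em-dash entities
* Three consecutive dots ("..." or ". . .") into an ellipsis entity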

This means you can write, edit, and save your posts using plain old ASCII straight quotes, plain dashes, and plain dots, but your published posts (and final HTML output) will appear with smart quotes, em-dashes, and proper ellipses.

RubyPants does not modify characters within <pre>, <code>, <kbd>, <math> or <script> tag blocks. Typically, these tags are used to display text where smart quotes and other "smart punctuation" would not be appropriate, such as source code or example markup.

Backslash Escapes

If you need to use literal straight quotes (or plain hyphens and periods), RubyPants accepts the following backslash escape sequences to force non-smart punctuation. It does so by transforming the escape sequence into a decimal-encoded HTML entity:

\\    \"    \'    \.    \-    \`

This is useful, for example, when you want to use straight quotes as foot and inch marks: 6'2" tall; a 17" iMac. (Write 6\'2\" and 17\", respectively.)
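The escape handling can be illustrated with a small standalone sketch. The names ESCAPE_ENTITIES and escape_to_entities are hypothetical and not part of RubyPants; the sketch also uses a single regular expression pass, whereas the library's process_escapes chains several gsub calls.

```ruby
# Hypothetical standalone helper, not part of the RubyPants API.
# Maps each escapable character to its decimal-encoded HTML entity.
ESCAPE_ENTITIES = {
  "\\" => "&#92;",
  '"'  => "&#34;",
  "'"  => "&#39;",
  "."  => "&#46;",
  "-"  => "&#45;",
  "`"  => "&#96;"
}.freeze

# Replace each backslash escape with the entity of the escaped character.
def escape_to_entities(text)
  text.gsub(/\\([\\"'.\-`])/) { ESCAPE_ENTITIES[$1] }
end

puts escape_to_entities(%q{6\'2\" tall; a 17\" iMac})
# prints: 6&#39;2&#34; tall; a 17&#34; iMac
```

Because the entities contain no quote characters, the later quote-curling steps have nothing left to curl at those positions.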

Algorithmic Shortcomings

One situation in which quotes will get curled the wrong way is when apostrophes are used at the start of leading contractions. For example:

'Twas the night before Christmas.

In the case above, RubyPants will turn the apostrophe into an opening single-quote, when in fact it should be a closing one. I don't think this problem can be solved in the general case--every word processor I've tried gets this wrong as well. In such cases, it's best to use the proper HTML entity for closing single-quotes ("&#8217;") by hand.

Bugs

To file bug reports or feature requests (except see above) please send email to: chneukirchen@gmail.com

If the bug involves quotes being curled the wrong way, please send example text to illustrate.

Authors

John Gruber did all of the hard work of writing this software in Perl for Movable Type and almost all of this useful documentation. Chad Miller ported it to Python to use with Pyblosxom.

Christian Neukirchen provided the Ruby port, as a general-purpose library that follows the *Cloth API.

Copyright and License

SmartyPants license:

Copyright (c) 2003 John Gruber (daringfireball.net) All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright owner or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.

RubyPants license

RubyPants is a derivative work of SmartyPants and smartypants.py.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright owner or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.

Links

John Gruber: daringfireball.net

SmartyPants: daringfireball.net/projects/smartypants

Chad Miller: web.chad.org

Christian Neukirchen: kronavita.de/chris

Constants

VERSION

Public Class Methods

new(string, options=[2])

Create a new RubyPants instance with the text in string.

Allowed elements in the options array:

0: do nothing
1: enable all, using only em-dash shortcuts
2: enable all, using old school en- and em-dash shortcuts (default)
3: enable all, using inverted old school en- and em-dash shortcuts
-1: stupefy (translate HTML entities to their ASCII counterparts)

If you don't like any of these defaults, you can pass symbols to change RubyPants' behavior:

:quotes: quotes
:backticks: backtick quotes (``double'' only)
:allbackticks: backtick quotes (``double'' and `single')
:dashes: dashes
:oldschool: old school dashes
:inverted: inverted old school dashes
:ellipses: ellipses
:convertquotes: convert &quot; entities to " for Dreamweaver users
:stupefy: translate RubyPants HTML entities to their ASCII counterparts

# File rubypants.rb, line 207
def initialize(string, options=[2])
  super string
  @options = [*options]
end
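The precedence among these options is easy to miss: a numeric preset, when present, takes over and any symbols in the same array are ignored. The following is a minimal sketch of that resolution order, modelled on the branching in to_html. The names DASH_PRESETS and resolve_options are hypothetical and do not exist in RubyPants.

```ruby
# Hypothetical helper mirroring the option precedence; not part of RubyPants.
DASH_PRESETS = { 1 => :normal, 2 => :oldschool, 3 => :inverted }.freeze

def resolve_options(options)
  opts = [*options]
  # 0 wins over everything else.
  return { noop: true } if opts.include?(0)

  # Numeric presets 1, 2, 3 are checked in that order.
  preset = DASH_PRESETS.keys.find { |n| opts.include?(n) }
  if preset
    return { quotes: true, backticks: true, ellipses: true,
             dashes: DASH_PRESETS[preset] }
  end
  return { stupefy: true } if opts.include?(-1)

  # Only when no number is given are the symbols consulted.
  dashes = nil
  dashes = :normal    if opts.include?(:dashes)
  dashes = :oldschool if opts.include?(:oldschool)
  dashes = :inverted  if opts.include?(:inverted)
  {
    quotes: opts.include?(:quotes),
    backticks: opts.include?(:allbackticks) ? :both : opts.include?(:backticks),
    dashes: dashes,
    ellipses: opts.include?(:ellipses),
    convert_quotes: opts.include?(:convertquotes),
    stupefy: opts.include?(:stupefy)
  }
end

p resolve_options([2, :stupefy])       # :stupefy is ignored because 2 is present
p resolve_options([:quotes, :inverted])
```

In practice this means symbols should be passed on their own, for example [:quotes, :dashes], rather than mixed with a numeric preset.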

Public Instance Methods

to_html()

Apply SmartyPants transformations.

# File rubypants.rb, line 213
def to_html
  do_quotes = do_backticks = do_dashes = do_ellipses = do_stupefy = nil
  convert_quotes = false

  if @options.include? 0
    # Do nothing.
    return self
  elsif @options.include? 1
    # Do everything, turn all options on.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :normal
  elsif @options.include? 2
    # Do everything, turn all options on, use old school dash shorthand.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :oldschool
  elsif @options.include? 3
    # Do everything, turn all options on, use inverted old school
    # dash shorthand.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :inverted
  elsif @options.include?(-1)
    do_stupefy = true
  else
    do_quotes =                @options.include? :quotes
    do_backticks =             @options.include? :backticks
    do_backticks = :both    if @options.include? :allbackticks
    do_dashes = :normal     if @options.include? :dashes
    do_dashes = :oldschool  if @options.include? :oldschool
    do_dashes = :inverted   if @options.include? :inverted
    do_ellipses =              @options.include? :ellipses
    convert_quotes =           @options.include? :convertquotes
    do_stupefy =               @options.include? :stupefy
  end

  # Parse the HTML
  tokens = tokenize
  
  # Keep track of when we're inside <pre> or <code> tags.
  in_pre = false

  # The result is accumulated here.
  result = ""

  # This is a cheat, used to get some context for one-character
  # tokens that consist of just a quote char. What we do is remember
  # the last character of the previous text token, to use as context
  # to curl single-character quote tokens correctly.
  prev_token_last_char = nil

  tokens.each { |token|
    if token.first == :tag
      result << token[1]
      if token[1] =~ %r!<(/?)(?:pre|code|kbd|script|math)[\s>]!
        in_pre = ($1 != "/")  # Opening or closing tag?
      end
    else
      t = token[1]

      # Remember last char of this token before processing.
      last_char = t[-1].chr

      unless in_pre
        t = process_escapes t
        
        t.gsub!(/&quot;/, '"')  if convert_quotes

        if do_dashes
          t = educate_dashes t            if do_dashes == :normal
          t = educate_dashes_oldschool t  if do_dashes == :oldschool
          t = educate_dashes_inverted t   if do_dashes == :inverted
        end

        t = educate_ellipses t  if do_ellipses

        # Note: backticks need to be processed before quotes.
        if do_backticks
          t = educate_backticks t
          t = educate_single_backticks t  if do_backticks == :both
        end

        if do_quotes
          if t == "'"
            # Special case: single-character ' token
            if prev_token_last_char =~ /\S/
              t = "&#8217;"
            else
              t = "&#8216;"
            end
          elsif t == '"'
            # Special case: single-character " token
            if prev_token_last_char =~ /\S/
              t = "&#8221;"
            else
              t = "&#8220;"
            end
          else
            # Normal case:                  
            t = educate_quotes t
          end
        end

        t = stupefy_entities t  if do_stupefy
      end

      prev_token_last_char = last_char
      result << t
    end
  }

  # Done
  result
end
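The way to_html leaves protected tag blocks alone can be shown in isolation. The sketch below is a simplified illustration, assuming a hypothetical helper em_dashes_outside_code that applies only the em-dash rule; it splits on tags with String#split instead of calling the library's tokenize.

```ruby
# Hypothetical helper, not part of RubyPants: convert "--" to an em-dash
# entity everywhere except inside <pre>, <code>, <kbd>, <script>, <math>.
PROTECTED_TAG = %r{<(/?)(?:pre|code|kbd|script|math)[\s>]}

def em_dashes_outside_code(html)
  in_protected = false
  # The capture group makes split keep the tags as separate pieces.
  html.split(/(<[^>]*>)/).map { |piece|
    if piece =~ /\A<[^>]*>\z/
      # A tag: update the flag on opening or closing protected tags.
      in_protected = ($1 != "/") if piece =~ PROTECTED_TAG
      piece
    elsif in_protected
      piece                          # text inside a protected block
    else
      piece.gsub("--", "&#8212;")    # ordinary text
    end
  }.join
end

puts em_dashes_outside_code("<p>a--b</p><pre>c--d</pre>")
# prints: <p>a&#8212;b</p><pre>c--d</pre>
```

As in to_html, a single flag is used, so nested protected tags are not tracked separately.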

Protected Instance Methods

educate_backticks(str)

Return the string, with "``backticks''"-style double quotes translated into HTML curly quote entities.

# File rubypants.rb, line 384
def educate_backticks(str)
  str.gsub("``", '&#8220;').gsub("''", '&#8221;')
end

educate_dashes(str)

Return the string, with each instance of "--" translated to an em-dash HTML entity.

# File rubypants.rb, line 347
def educate_dashes(str)
  str.gsub(/--/, '&#8212;')
end

educate_dashes_inverted(str)

Return the string, with each instance of "--" translated to an em-dash HTML entity, and each "---" translated to an en-dash HTML entity. Two reasons why: First, unlike the en- and em-dash syntax supported by educate_dashes_oldschool, it's compatible with existing entries written before SmartyPants 1.1, back when "--" was only used for em-dashes. Second, em-dashes are more common than en-dashes, and so it sort of makes sense that the shortcut should be shorter to type. (Thanks to Aaron Swartz for the idea.)

# File rubypants.rb, line 369
def educate_dashes_inverted(str)
  str.gsub(/---/, '&#8211;').gsub(/--/, '&#8212;')
end

educate_dashes_oldschool(str)

Return the string, with each instance of "--" translated to an en-dash HTML entity, and each "---" translated to an em-dash HTML entity.
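The three dash modes differ only in which entity each shortcut maps to. A minimal side-by-side sketch, using a hypothetical DASH_MODES table of lambdas rather than the RubyPants methods themselves:

```ruby
# Hypothetical table for illustration; not part of RubyPants.
# In the two-shortcut modes, "---" must be replaced before "--".
DASH_MODES = {
  normal:    ->(s) { s.gsub("--", "&#8212;") },
  oldschool: ->(s) { s.gsub("---", "&#8212;").gsub("--", "&#8211;") },
  inverted:  ->(s) { s.gsub("---", "&#8211;").gsub("--", "&#8212;") }
}.freeze

sample = "pages 10--12 --- roughly"
DASH_MODES.each { |name, fn| puts "#{name}: #{fn.call(sample)}" }
# normal: pages 10&#8212;12 &#8212;- roughly
# oldschool: pages 10&#8211;12 &#8212; roughly
# inverted: pages 10&#8212;12 &#8211; roughly
```

The normal mode has no rule for "---", so a triple dash becomes an em-dash followed by a stray hyphen.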

# File rubypants.rb, line 355
def educate_dashes_oldschool(str)
  str.gsub(/---/, '&#8212;').gsub(/--/, '&#8211;')
end

educate_ellipses(str)

Return the string, with each instance of "..." translated to an ellipsis HTML entity. Also converts the case where there are spaces between the dots.

# File rubypants.rb, line 377
def educate_ellipses(str)
  str.gsub('...', '&#8230;').gsub('. . .', '&#8230;')
end

educate_quotes(str)

Return the string, with "educated" curly quote HTML entities.
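The curling heuristic works as an ordered series of substitutions, where each step handles the quotes the earlier steps left behind. The sketch below is a deliberately reduced version for double quotes only, under the assumption that the special cases and entity-based contexts can be left out; curl_double_quotes is a hypothetical name and not the library's method.

```ruby
# Simplified, hypothetical sketch of the rule ordering; double quotes only.
def curl_double_quotes(text)
  str = text.dup
  # 1. Opening: after whitespace or "--", and before a word character.
  str.gsub!(/(\s|--)"(?=\w)/, '\1&#8220;')
  # 2. Closing: after anything but whitespace, an opening bracket, or a hyphen.
  str.gsub!(/([^\s\[{(\-])"/, '\1&#8221;')
  # 3. Closing: before whitespace or at the end of a line.
  str.gsub!(/"(\s|$)/, '&#8221;\1')
  # 4. Whatever is left is treated as an opening quote.
  str.gsub!(/"/, '&#8220;')
  str
end

puts curl_double_quotes('He said "hi" to me')
# prints: He said &#8220;hi&#8221; to me
puts curl_double_quotes('"Hi," she said.')
# prints: &#8220;Hi,&#8221; she said.
```

The second example shows why the final catch-all matters: a quote at the very start of the string has no preceding character, so only step 4 reaches it.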

# File rubypants.rb, line 397
def educate_quotes(str)
  punct_class = '[!"#\$\%\'()*+,\-.\/:;<=>?\@\[\\\\\]\^_`{|}~]'

  str = str.dup
    
  # Special case if the very first character is a quote followed by
  # punctuation at a non-word-break. Close the quotes by brute
  # force:
  str.gsub!(/^'(?=#{punct_class}\B)/, '&#8217;')
  str.gsub!(/^"(?=#{punct_class}\B)/, '&#8221;')

  # Special case for double sets of quotes, e.g.:
  #   <p>He said, "'Quoted' words in a larger quote."</p>
  str.gsub!(/"'(?=\w)/, '&#8220;&#8216;')
  str.gsub!(/'"(?=\w)/, '&#8216;&#8220;')

  # Special case for decade abbreviations (the '80s):
  str.gsub!(/'(?=\d\ds)/, '&#8217;')

  close_class = %![^\ \t\r\n\\[\{\(\-]!
  dec_dashes = '&#8211;|&#8212;'
  
  # Get most opening single quotes:
  str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)'(?=\w)/,
           '\1&#8216;')
  # Single closing quotes:
  str.gsub!(/(#{close_class})'/, '\1&#8217;')
  str.gsub!(/'(\s|s\b|$)/, '&#8217;\1')
  # Any remaining single quotes should be opening ones:
  str.gsub!(/'/, '&#8216;')

  # Get most opening double quotes:
  str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)"(?=\w)/,
           '\1&#8220;')
  # Double closing quotes:
  str.gsub!(/(#{close_class})"/, '\1&#8221;')
  str.gsub!(/"(\s|s\b|$)/, '&#8221;\1')
  # Any remaining quotes should be opening ones:
  str.gsub!(/"/, '&#8220;')

  str
end

educate_single_backticks(str)

Return the string, with "`backticks'"-style single quotes translated into HTML curly quote entities.

# File rubypants.rb, line 391
def educate_single_backticks(str)
  str.gsub("`", '&#8216;').gsub("'", '&#8217;')
end

process_escapes(str)

Return the string after processing the following backslash escape sequences. This is useful if you want to force a "dumb" quote or other character to appear.

The escape sequences are:

\\    \"    \'    \.    \-    \`

# File rubypants.rb, line 335
def process_escapes(str)
  str.gsub('\\\\', '&#92;').
    gsub('\"', '&#34;').
    gsub("\\\'", '&#39;').
    gsub('\.', '&#46;').
    gsub('\-', '&#45;').
    gsub('\`', '&#96;')
end

stupefy_entities(str)

Return the string, with each RubyPants HTML entity translated to its ASCII counterpart.

Note: This is not reversible (but exactly the same as in SmartyPants)
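The reason the translation is not reversible is that several entities collapse onto the same ASCII character. A minimal sketch, assuming a hypothetical lookup table STUPEFY_MAP and helper stupefy in place of the chained gsub calls:

```ruby
# Hypothetical table-driven variant for illustration; not part of RubyPants.
STUPEFY_MAP = {
  "&#8211;" => "-",    # en-dash
  "&#8212;" => "--",   # em-dash
  "&#8216;" => "'",    # open single quote
  "&#8217;" => "'",    # close single quote
  "&#8220;" => '"',    # open double quote
  "&#8221;" => '"',    # close double quote
  "&#8230;" => "..."   # ellipsis
}.freeze
STUPEFY_PATTERN = Regexp.union(STUPEFY_MAP.keys)

# Replace every known entity with its ASCII counterpart in one pass.
def stupefy(text)
  text.gsub(STUPEFY_PATTERN, STUPEFY_MAP)
end

puts stupefy("&#8220;Yes&#8221;&#8212;no&#8230;")
# prints: "Yes"--no...
```

Opening and closing quotes both become the same straight quote, and an en-dash becomes indistinguishable from a hyphen, so the original entities cannot be recovered from the output.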

# File rubypants.rb, line 445
def stupefy_entities(str)
  str.
    gsub(/&#8211;/, '-').      # en-dash
    gsub(/&#8212;/, '--').     # em-dash
    
    gsub(/&#8216;/, "'").      # open single quote
    gsub(/&#8217;/, "'").      # close single quote
    
    gsub(/&#8220;/, '"').      # open double quote
    gsub(/&#8221;/, '"').      # close double quote
    
    gsub(/&#8230;/, '...')     # ellipsis
end

tokenize()

Return an array of the tokens comprising the string. Each token is either a tag (possibly with nested tags contained therein, such as <a href="<MTFoo>">) or a run of text between tags. Each element of the array is a two-element array; the first is either :tag or :text; the second is the actual value.

Based on the _tokenize() subroutine from Brad Choate's MTRegex plugin. <www.bradchoate.com/past/mtregex.php>

This is actually the easier variant using tag_soup, as used by Chad Miller in the Python port of SmartyPants.

# File rubypants.rb, line 471
def tokenize
  tag_soup = /([^<]*)(<[^>]*>)/

  tokens = []

  prev_end = 0
  scan(tag_soup) {
    tokens << [:text, $1]  if $1 != ""
    tokens << [:tag, $2]
    
    prev_end = $~.end(0)
  }

  if prev_end < size
    tokens << [:text, self[prev_end..-1]]
  end

  tokens
end
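The token structure is easiest to see on a small input. The sketch below restates the same tag-soup approach as a free-standing function that takes the string as an argument; tokenize_html is a hypothetical name, since the real method is defined on the RubyPants string itself.

```ruby
# Hypothetical free-standing variant of the tag-soup tokenizer.
def tokenize_html(html)
  tokens = []
  prev_end = 0
  # Each match is an optional run of text followed by one tag.
  html.scan(/([^<]*)(<[^>]*>)/) do |text, tag|
    tokens << [:text, text] unless text.empty?
    tokens << [:tag, tag]
    prev_end = $~.end(0)   # offset just past the current match
  end
  # Anything after the last tag is a final text token.
  tokens << [:text, html[prev_end..-1]] if prev_end < html.size
  tokens
end

p tokenize_html("a <i>b</i> c")
# prints: [[:text, "a "], [:tag, "<i>"], [:text, "b"], [:tag, "</i>"], [:text, " c"]]
```

Note that this simple pattern ends a tag at the first ">", so a tag nested inside an attribute value is not kept together as one token.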
