Sites that I find interesting from day to day but do not have time to review right away, so I log them in this blog in order to read them later.

Monday, November 20, 2006

Basic regular expressions for Open Office

The following table summarizes some of the basic regular expressions that can be used in Open Office, usually in the Find & Replace dialogs. This information (and much more) can be found at this link: "Writer - Find Text using Wildcards".
Character
Result / Use
Any character
Stands for any single character unless otherwise specified below.

.

Represents any single character except for a line break or paragraph break. For example, the search term "o.en" returns both "oven" and "open".
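As a rough illustration outside Open Office, the same wildcard can be tried with Python's `re` module. This is only a sketch: Python's engine is not the one Open Office uses, and the word list (including "omen" and "often") is made up for the example.

```python
import re

# "." matches any single character except a newline (Python's default),
# which mirrors the "except for a line break" rule described above.
candidates = ["oven", "open", "omen", "o\nen", "often"]
pattern = re.compile(r"o.en")

# fullmatch requires the whole word to fit the pattern.
matches = [word for word in candidates if pattern.fullmatch(word)]
print(matches)
```

"often" is rejected because "." stands for exactly one character, and "o\nen" is rejected because the character between "o" and "en" is a line break.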

{2}
Defines the exact number of times that the character in front of the opening bracket occurs. For example, "tre{2}" finds "tree".

{1,2}
Defines the range of number of times that the character in front of the opening bracket can occur. For example, "tre{1,2}" finds both "tree" and "treated".

{1,}
Defines the minimum number of times that the character in front of the opening bracket can occur. For example, "tre{2,}" finds "tree", "treee", and "treeeee".
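The three counting forms can be compared side by side in Python, whose `re` module happens to use the same brace syntax. This is a sketch on an invented sample string, not a reproduction of how the Open Office dialog reports matches.

```python
import re

text = "tre tree treee treated"

# Exactly two "e" characters after "tr".
exact = re.findall(r"tre{2}", text)

# Between one and two "e" characters; the longest allowed run is taken.
ranged = re.findall(r"tre{1,2}", text)

# At least two "e" characters, with no upper limit.
minimum = re.findall(r"tre{2,}", text)

print(exact)
print(ranged)
print(minimum)
```

Note that "treated" only shows up with the {1,2} form, and only its leading "tre" is matched.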

*

Shorthand for "{0,}". Represents zero or more occurrences of the character in front of the "*". For example, "Ab*C" finds "AC", "AbC", "AbbC", "AbbbC", and so on.

+

Shorthand for "{1,}". Represents one or more of the characters in front of the "+". For example, "Ab+C" finds "AbC", but not "AC". If you are searching for "A.+C", the longest possible string that matches this search pattern in a paragraph is always found. If the paragraph contains the string "AbCbbbC", the entire passage is highlighted.
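The difference between "*" and "+", and the longest-match behaviour of "A.+C", can be checked with a short Python sketch. Python's `re` module is assumed here as a stand-in; it is greedy by default in the same way the text describes.

```python
import re

samples = ["AC", "AbC", "AbbC"]

# "*" accepts zero or more "b" characters, so "AC" qualifies.
zero_or_more = [s for s in samples if re.fullmatch(r"Ab*C", s)]

# "+" needs at least one "b", so "AC" is rejected.
one_or_more = [s for s in samples if re.fullmatch(r"Ab+C", s)]

# ".+" grabs as much as it can, so the match runs to the last "C".
greedy = re.search(r"A.+C", "AbCbbbC").group(0)

print(zero_or_more, one_or_more, greedy)
```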

?

Shorthand for "{0,1}". Represents zero or one of the characters in front of the "?". For example, "Documents?" finds "Document" and "Documents" and "x(ab|c)?y" finds "xy", "xaby", or "xcy".
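A small Python sketch of the optional marker, using the two patterns from the description. The extra test string "xabcy" is added for illustration to show that the group is taken at most once.

```python
import re

# The trailing "s" is optional.
plural = re.compile(r"Documents?")
plural_hits = [w for w in ["Document", "Documents"] if plural.fullmatch(w)]

# The whole group "(ab|c)" is optional, and may appear only once.
grouped = re.compile(r"x(ab|c)?y")
grouped_hits = [w for w in ["xy", "xaby", "xcy", "xabcy"] if grouped.fullmatch(w)]

print(plural_hits)
print(grouped_hits)
```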

\
Find interprets the special character that follows the "\" as a normal character and not as a regular expression (except for the combinations \n, \t, \>, and \<). For example, "source\." finds "source.", not "sourced" or "sources".
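Escaping works the same way in Python's `re` module, so the "source\." example can be sketched there. The call to `re.escape` is a Python convenience for escaping a literal automatically and has no counterpart in the Open Office dialog.

```python
import re

words = ["source.", "sourced", "sources"]

# The backslash turns "." into a literal full stop.
literal_dot = [w for w in words if re.search(r"source\.", w)]

# Without the backslash, "." matches any character, so all three match.
any_char = [w for w in words if re.search(r"source.", w)]

# re.escape builds the escaped form of a literal string.
escaped = re.escape("source.")

print(literal_dot, any_char, escaped)
```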

\n

Represents a line break that was inserted with the Shift+Enter key combination. To change a line break into a paragraph break, enter \n in the Search for and Replace with boxes, and then perform a find and replace.

\t

Represents a tab. You can also use this expression in the Replace with box.

\>

Represents the end of a word. In other words, only finds the search term if it appears at the end of a word. For example, "book\>" finds "checkbook", but not "bookmark".

\<

Represents the beginning of a word. In other words, only finds the search term if it appears at the beginning of a word. For example, "\<book" finds "bookmark", but not "checkbook".
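Python's `re` module does not support "\<" and "\>". The closest equivalent is "\b", which matches at either edge of a word. The sketch below uses it as an approximation of the two markers, with its position in the pattern deciding which edge is tested.

```python
import re

words = ["checkbook", "bookmark", "book"]

# "\b" after the term: the term must sit at the end of a word (like "\>").
ends_with = [w for w in words if re.search(r"book\b", w)]

# "\b" before the term: the term must sit at the start of a word (like "\<").
starts_with = [w for w in words if re.search(r"\bbook", w)]

print(ends_with)
print(starts_with)
```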

^
Represents the beginning of a paragraph. In other words, only finds the search term if the term is at the beginning of a paragraph. Special objects such as empty fields or character-anchored frames at the beginning of a paragraph are ignored. Example: "^OpenOffice".

$

Represents the end of a paragraph. In other words, only finds the search term if the term appears at the end of the paragraph. Special objects such as empty fields or character-anchored frames at the end of a paragraph are ignored. Example: "OpenOffice.org$".
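To experiment with the two anchors outside Writer, each line of a Python string can stand in for a paragraph. This is an assumption made for the sketch: `re.MULTILINE` makes "^" and "$" apply per line, and the dot in "OpenOffice.org" is escaped here so that it matches only a literal full stop.

```python
import re

# Three "paragraphs", one per line (sample text invented for the example).
document = (
    "OpenOffice.org is free.\n"
    "I use OpenOffice.org\n"
    "Try OpenOffice.org today"
)

# Only the first paragraph starts with the term.
at_start = re.findall(r"^OpenOffice", document, flags=re.MULTILINE)

# Only the second paragraph ends with the term.
at_end = re.findall(r"OpenOffice\.org$", document, flags=re.MULTILINE)

print(len(at_start), len(at_end))
```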

&

Adds the string that was found by the search criteria in the Search for box to the term in the Replace with box when you make a replacement.
For example, if you enter "window" in the Search for box and "&frame" in the Replace with box, the word "window" is replaced with "windowframe".
You can also enter an "&" in the Replace with box to modify the Attributes or the Format of the string found by the search criteria.
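Python has no "&" in replacement strings. Its equivalent for "the whole match" is `\g<0>`. The sketch below reproduces the "window" to "windowframe" example under that substitution; the sample sentence is invented.

```python
import re

sentence = "Close the window."

# "\g<0>" inserts the text that was found, like "&" in the Replace with box.
replaced = re.sub(r"window", r"\g<0>frame", sentence)

print(replaced)
```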

[abc123]

Represents a single one of the characters that are between the brackets.

[a-e]

Shorthand for "[abcde]". Represents any single character in the range between a and e.

[a-eh-x]
Shorthand for "[abcdehijklmnopqrstuvwx]". Represents any single character in the range between a-e or h-x.

[^a-s]
Represents any single character that is not in the range between a and s.
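The bracket forms above behave the same way in Python's `re` module, so they can be tried on short invented strings. Note that Python always matches case-sensitively unless told otherwise, which may differ from the dialog's Match case setting.

```python
import re

# Any one of the listed characters.
listed = re.findall(r"[abc123]", "a1d4c")

# Two ranges in one class: a-e or h-x ("f", "g", "y" and "z" fall outside).
ranges = re.findall(r"[a-eh-x]", "abfgxyz")

# A leading "^" negates the class: anything not between a and s.
negated = re.findall(r"[^a-s]", "astz9")

print(listed, ranges, negated)
```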

\xXXXX
Represents a special character based on its four-digit hexadecimal code (XXXX).
The code for the special character depends on the font used. You can view the codes by choosing Insert - Special Character.

|
Finds the terms that occur before or after the "|". For example, "this|that" finds "this" or "that".

(...)
Defines the characters inside the parentheses as a reference. You can then refer to the first reference in the current expression with "\1", to the second reference with "\2", and so on.
For example, if your text contains the number 13487889 and you search using the regular expression (8)7\1\1, "8788" is found.
You can also use () to group terms, for example, "a(bc)?d" finds "ad" or "abcd".
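Alternation, references, and grouping can all be checked with the examples given above. Python's `re` module is used here as a stand-in and uses the same "\1" notation for the first reference.

```python
import re

# "|" matches either alternative.
either = re.findall(r"this|that", "this and that")

# "(8)" captures an 8; each "\1" then requires another 8.
backref = re.search(r"(8)7\1\1", "13487889").group(0)

# Parentheses also group: "bc" is optional as a unit.
grouped = [s for s in ["ad", "abcd", "abd"] if re.fullmatch(r"a(bc)?d", s)]

print(either, backref, grouped)
```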

[:digit:]
Represents a decimal digit.

[:space:]
Represents a white space character such as a space.

[:print:]
Represents a printable character.

[:cntrl:]
Represents a nonprinting character.

[:alnum:]
Represents an alphanumeric character ([:alpha:] and [:digit:]).

[:alpha:]
Represents an alphabetic character.

[:lower:]
Represents a lowercase character if Match case is selected in Options.

[:upper:]
Represents an uppercase character if Match case is selected in Options.
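Python's `re` module does not understand the "[:name:]" classes, so trying them outside Open Office needs a translation step. The helper below is hypothetical. Its mapping table is an ASCII-only approximation chosen for this sketch, not an official equivalence.

```python
import re

# Hypothetical, ASCII-only stand-ins for the named classes above.
POSIX_APPROX = {
    "[:digit:]": r"[0-9]",
    "[:space:]": r"\s",
    "[:alpha:]": r"[A-Za-z]",
    "[:alnum:]": r"[A-Za-z0-9]",
    "[:lower:]": r"[a-z]",
    "[:upper:]": r"[A-Z]",
}

def translate(pattern: str) -> str:
    """Replace each named class with its approximate Python equivalent."""
    for name, equivalent in POSIX_APPROX.items():
        pattern = pattern.replace(name, equivalent)
    return pattern

translated = translate("[:upper:][:lower:]+[:space:][:digit:]{2}")
found = re.search(translated, "See Room 42 upstairs").group(0)

print(translated)
print(found)
```

Accented letters and non-Latin digits are not covered by these ranges, so results may differ from what Open Office finds in such text.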

For a logical search expression with nested AND and OR operators, use parentheses.

J.