This shows you the differences between two versions of the page.
public:nnels:etext:regex [2018/07/11 22:31] leah.brochu |
public:nnels:etext:regex [2018/07/12 17:07] leah.brochu |
||
---|---|---|---|
Line 23: | Line 23: | ||
[[https:// | [[https:// | ||
- | * Word has a lot of options to find letters (^$) and numbers (^#) but these only work with the wildcard option //off// (which it is by default). Only turn the wildcard option on if you're using regex options. Read the info page carefully on when things apply with the wildcard option on/off. | + | * Word has a lot of options to find letters (^$) and numbers (^#) when using the non-regex [[public: |
* A lot of the codes for special characters (e.g. page break) are under the " | * A lot of the codes for special characters (e.g. page break) are under the " | ||
Line 77: | Line 77: | ||
<WRAP center round box 80%> | <WRAP center round box 80%> | ||
- | **PROBLEM: | + | **PROBLEM: |
**SOLUTION: | **SOLUTION: | ||
Line 92: | Line 92: | ||
<WRAP center round box 80%> | <WRAP center round box 80%> | ||
- | **PROBLEM**: | ||
- | **SOLUTION**: Use MS Word' | + | **PROBLEM:** OCR did not recognize spaces around quotation marks. |
+ | * Example A: As one of Montgomery' | ||
+ | * Example B: The "nasty little '' | ||
+ | This problem has an added complexity; the pattern has two different solutions: | ||
+ | * Example A will need to say: ... later put '' | ||
+ | * Example B will need to say: The "nasty little troublemaker''," | ||
- | Find: '' | + | **SOLUTIONS: |
+ | Example A:\\ | ||
+ | |||
+ | Find: '' | ||
+ | Replace: '' | ||
+ | |||
+ | Example B: | ||
+ | |||
+ | Find: '' | ||
+ | Replace: '' | ||
+ | |||
+ | Notes: | ||
+ | * You will **not** be able to use " | ||
+ | * You will also need to re-do this, searching for periods instead of commas. | ||
- | Replace with: '' | ||
</ | </ | ||
---- | ---- | ||
+ | |||
<WRAP center round box 80%> | <WRAP center round box 80%> | ||
- | **PROBLEM**: | + | **PROBLEM**: |
- | **SOLUTION**: | + | **SOLUTION**: |
+ | </ | ||
- | Find: '' | + | ---- |
- | Replace with: '' | + | <WRAP center round box 80%> |
+ | **PROBLEM**: There are newlines/ | ||
- | In LibreOffice, | + | **SOLUTION**: |
</ | </ | ||
Line 121: | Line 140: | ||
'' | '' | ||
- | **SOLUTION**: | + | **SOLUTION**: |
- | + | ||
- | Find: '' | + | |
- | + | ||
- | Replace with: nothing. If you're doing a paginated title, replace with page breaks. | + | |
- | + | ||
- | Repeat the search with one ^# removed from the beginning and one from after the .indd to catch 2-digit page numbers, then once more with another ^# removed from each place to catch single-digit page numbers. The screenshot below shows an example with a 1-digit page number, followed by the command used to isolate all such instances. | + |
- | + | ||
- | <WRAP center round box 60%> | + | |
- | + | ||
- | {{:nnels:documentation:content: | + | |
- | + | ||
- | Find: ^# | + | |
- | </ | + | |
- | + | ||
- | You will also need to run it with the leading ^#^p to catch footer text that does not have a page number with it. | + |
</ | </ | ||
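
The removed passage above describes running Word's non-wildcard search several times, dropping one `^#` each time, to catch three-, two-, and one-digit page numbers around an `.indd` footer. Outside Word, a regex quantifier can cover all three cases in one pass. The following is a minimal Python sketch, assuming a hypothetical footer of the form `<page> <name>.indd <page>` on its own line. The real footer text is not shown in this diff, so the pattern, the `strip_footers` helper, and the sample text are illustrative only.

```python
import re

# ASSUMPTION: the footer looks like "<page> <name>.indd <page>" on its own line.
# This shape is hypothetical; adjust the pattern to the actual footer text.
# \d{1,3} matches one to three digits, so a single pass covers what Word's
# non-wildcard search needs three runs for (^#^#^#, then ^#^#, then ^#).
FOOTER = re.compile(r"^\d{1,3} \S+\.indd \d{1,3}\n", flags=re.MULTILINE)


def strip_footers(text: str, replacement: str = "") -> str:
    """Replace every footer line matching FOOTER with `replacement`."""
    return FOOTER.sub(replacement, text)


# Hypothetical sample text with a 1-digit and a 3-digit footer.
sample = (
    "End of a paragraph.\n"
    "7 Sample_Book.indd 7\n"
    "Start of the next page.\n"
    "112 Sample_Book.indd 112\n"
    "More text.\n"
)

cleaned = strip_footers(sample)
print(cleaned)
```

For a paginated title, a replacement string can be passed instead of the empty default, for example `strip_footers(sample, "\f")`, where the form feed is only a stand-in for whatever page-break marker the target format uses.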