Regular expressions

Notes on regular expressions used in a variety of situations.


// Remove beginning comma

// Remove beginning space
^ (.*)$ 

// Remove beginning three characters

// Remove ending comma

// Remove everything after "?" in URL 

// Remove number(s) from end of URL

// Remove number sequence in string
target="basefrm" id="itemTextLink[0-9]+"

// Remove "#31;#" from inside of string 

// Remove all lines that begin with "at" in a log file 

// Remove all lines that contain "mapMappableContainerException"
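
The removals listed above might be sketched in JavaScript as follows. Apart from the "beginning space" pattern, which is taken from these notes, every pattern and helper name here is an assumption for illustration, not the original expression. The "at" pattern also assumes that stack-trace lines may be indented.

```javascript
// Hypothetical helpers: names and most patterns are assumptions, not from the notes.
const removeLeadingComma = (s) => s.replace(/^,/, "");
// Pattern from the notes; keep capture group 1 to drop the leading space
const removeLeadingSpace = (s) => s.replace(/^ (.*)$/, "$1");
const removeFirstThreeChars = (s) => s.replace(/^.{3}/, "");
const removeTrailingComma = (s) => s.replace(/,$/, "");
// Drop "?" and everything after it
const removeQueryString = (url) => url.replace(/\?.*$/, "");
// Drop one or more digits at the very end
const removeTrailingDigits = (url) => url.replace(/[0-9]+$/, "");
// Drop every occurrence of the literal "#31;#"
const removeEntity = (s) => s.replace(/#31;#/g, "");

// Remove every line of a multi-line text that matches `re`
const dropLines = (text, re) =>
  text.split("\n").filter((line) => !re.test(line)).join("\n");

// Lines that begin with "at" (optionally indented), then lines containing the exception name
const dropStackFrames = (text) => dropLines(text, /^\s*at /);
const dropMapException = (text) => dropLines(text, /mapMappableContainerException/);
```

In an editor such as Notepad++ the same patterns would go in the Find field with an empty Replace field; the line-removal cases need the pattern to cover the whole line, e.g. `.*mapMappableContainerException.*`.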


Add regex pattern from

Log Files


Remove Smartlogic errors

.*parent not present.*
.*Unrecognized attribute.* 

Remove error details

.*at sun.*
.*at java.*
.*at com.*
.*at org.*
.*at twigkit.*

Remove webapp restart

.*common frames omitted.*
^SEVERE: The web application.*
^INFO: Binding twigkit.*
^INFO: Registering twigkit.*
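
A minimal sketch of applying the log-file patterns above in one pass, assuming the log is available as a single string. The patterns are copied from these notes; the function name `cleanLog` and the array name are hypothetical.

```javascript
// Patterns copied from the notes, grouped by purpose
const noisePatterns = [
  // Smartlogic errors
  /.*parent not present.*/,
  /.*Unrecognized attribute.*/,
  // error details (stack frames)
  /.*at sun.*/,
  /.*at java.*/,
  /.*at com.*/,
  /.*at org.*/,
  /.*at twigkit.*/,
  // webapp restart
  /.*common frames omitted.*/,
  /^SEVERE: The web application.*/,
  /^INFO: Binding twigkit.*/,
  /^INFO: Registering twigkit.*/,
];

// Keep only the lines that match none of the noise patterns
function cleanLog(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => !noisePatterns.some((re) => re.test(line)))
    .join("\n");
}
```

The regexes are deliberately non-global, so `test` carries no state between lines.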

Search - Lucidworks Fusion

Escape special characters in Solr Partial Update Indexer ID field

function(doc) {
	if (doc.getId() !== null) {

		// get the ID as a JavaScript string
		var new_id = String(doc.getId());

		// escape dashes with a backslash ("\\-" yields the two characters \-)
		new_id = new_id.replace(/-/g, "\\-");

		// change the id field
		doc.setField("id", new_id);
	}
	return doc;
}
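
One way to check the dash escaping outside Fusion is to run the same logic against a stand-in document. This is a sketch only: `escapeDashesInId` and `makeMockDoc` are hypothetical names, and the mock object mimics just the `getId` and `setField` calls used above, not the real pipeline document class. Note that in a JavaScript string literal `"\-"` is simply `-`, so the backslash has to be doubled to survive.

```javascript
// Named copy of the stage logic so it can run under Node
function escapeDashesInId(doc) {
  if (doc.getId() !== null) {
    // "\\-" is the two-character string backslash + dash
    var newId = String(doc.getId()).replace(/-/g, "\\-");
    doc.setField("id", newId);
  }
  return doc;
}

// Hypothetical stand-in for the pipeline document (assumption, not the Fusion API)
function makeMockDoc(id) {
  var fields = { id: id };
  return {
    getId: function () { return fields.id; },
    setField: function (name, value) { fields[name] = value; },
  };
}

var result = escapeDashesInId(makeMockDoc("abc-123-def"));
console.log(result.getId()); // prints abc\-123\-def
```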

ERStudio Web Macro

Create a CSV file of Data Models from HTML file

  • Upload the ERStudio Web content to a server
  • Open index.htm in the Chrome browser
  • Open Developer Tools -> Elements
  • Expand the Data Model View to show all Physical Data Models
  • Select the <html> element in the frame with name="treeframe" and
    copy its outerHTML
  • Paste into Notepad++ and format using the Tidy2 plugin
  • Remove the HTML using regex (record as a macro)
// find and replace with empty
.*(<body |body>|<div |div>|head>|html>|script>|style>|<table |table>|title>|<tr |tr>|(A {|BODY {|TD {|content=|initializeDocument|meta name)|<td valign=|</a>|</td>|.*src="Support.*|alt="mauyong">|.*alt=.*/>$|.*href='javascript:clickOnNode.*alt="|" target="basefrm").*

// find and replace with empty
" target="basefrm"

// pagetitle
// find
// replace with

// pageurl
// find
.*<a href="
// replace with

// find
// replace with 

// find
\.htm id=.*$
// replace with 

// remove empty lines
Edit -> Line Operations -> Remove Empty Lines
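
For running the cleanup outside Notepad++, two of the steps above could be scripted roughly like this. A sketch under assumptions: the function names are hypothetical, and only steps whose find text survives in these notes are covered.

```javascript
// Remove the literal text `" target="basefrm"` wherever it occurs
function stripTargetAttribute(text) {
  return text.split('" target="basefrm"').join("");
}

// Drop lines that are completely empty (whitespace-only lines are kept)
function removeEmptyLines(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line !== "")
    .join("\n");
}
```

Using `split`/`join` for the literal avoids having to escape regex metacharacters in the search text.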

// collect items on same line
// find
// replace with

// remove extra lines
Main Model
Core Physical Data Model

// add column headings




GNU Regular Expressions

Java Regular Expressions