[Eug-lug] To Horst-Q/colours

T. Joseph CARTER knghtbrd at bluecherry.net
Tue Jan 4 15:53:08 PST 2005


On Tue, Jan 04, 2005 at 12:07:33PM +0000, walter fry wrote:
> I have pencil on paper copied this command string and thought it wiser to 
> ask whether this might be destructive if I entered this as is...is it ok to 
> use?

Let me demonstrate it with real directories...

The first part:

knghtbrd at yumi:~$ find /usr/local/lib -type l -exec ls -l {} \;
lrwxr-xr-x    1 root     wheel          23 Dec 14 07:11 /usr/local/lib/libpisock++.0.dylib -> libpisock++.0.0.0.dylib
lrwxr-xr-x    1 root     wheel          23 Dec 14 07:11 /usr/local/lib/libpisock++.dylib -> libpisock++.0.0.0.dylib
lrwxr-xr-x    1 root     wheel          21 Dec 14 07:10 /usr/local/lib/libpisock.9.dylib -> libpisock.9.0.0.dylib
lrwxr-xr-x    1 root     wheel          21 Dec 14 07:10 /usr/local/lib/libpisock.dylib -> libpisock.9.0.0.dylib
lrwxr-xr-x    1 root     wheel          21 Dec 14 07:10 /usr/local/lib/libpisync.0.dylib -> libpisync.0.0.2.dylib
lrwxr-xr-x    1 root     wheel          21 Dec 14 07:10 /usr/local/lib/libpisync.dylib -> libpisync.0.0.2.dylib

Okay, so that runs ls -l on every symlink in the target directory,
/usr/local/lib in this case (chosen because it doesn't have much in it).
If I press up-arrow to recall the original line (it works like the doskey
command), add a | and a \ and hit enter, I can continue the command on a
second line instead of wrapping off the edge of the terminal.  I'll put
the sed part on the second line:

knghtbrd at yumi:~$ find /usr/local/lib -type l -exec ls -l {} \; | \
> sed 's/^.* \([^ ].*\) -> \(.*\)/ln -s \2 \/usr\/foo\/\1/'
ln -s libpisock++.0.0.0.dylib /usr/foo//usr/local/lib/libpisock++.0.dylib
ln -s libpisock++.0.0.0.dylib /usr/foo//usr/local/lib/libpisock++.dylib
ln -s libpisock.9.0.0.dylib /usr/foo//usr/local/lib/libpisock.9.dylib
ln -s libpisock.9.0.0.dylib /usr/foo//usr/local/lib/libpisock.dylib
ln -s libpisync.0.0.2.dylib /usr/foo//usr/local/lib/libpisync.0.dylib
ln -s libpisync.0.0.2.dylib /usr/foo//usr/local/lib/libpisync.dylib

(Note you don't type the > on the second line; that's the shell's way of
telling you that it's waiting for you to finish the command.)  If you look
at the output, you'll see something clever.  That ugly sed line turned the
ls output into a set of ln commands.  UNIX has a thing called a symbolic
link--it's a sort of alias for files.  ln -s is the command to create
them.  You tell it what you want the link to point to and then where you
want the link.  The syntax is ln -s <target> <link>, and <target> is
either a full path starting with / or a path relative to the directory
that holds <link>.
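To see the argument order for yourself, here is a minimal sketch that only touches a scratch directory.  The names work, real.txt and alias are made up for the illustration; nothing here comes from the original command.

```shell
# Sketch only: work, real.txt and alias are made-up names for illustration.
work=$(mktemp -d)                 # scratch directory, nothing real is touched
echo "hello" > "$work/real.txt"   # the file the link will point to
ln -s real.txt "$work/alias"      # what to point to first, then the link itself
ls -l "$work/alias"               # the listing ends in: alias -> real.txt
contents=$(cat "$work/alias")     # reading the link reads real.txt
echo "$contents"
rm -rf "$work"                    # clean up the scratch directory
```

The target real.txt is given as a relative path, so it is looked up in the directory that holds alias, which is why cat finds it.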

What's disgusting about it is the sed line itself!  Regular expressions
are nasty (but useful) little bits of what look like line noise that
specify a pattern to look for.  Above there is the s/// command to sed,
which takes a regular expression after the first / and a replacement
pattern after the second /.  The third / ends the replacement, and there
can be a couple of flags after it.  In the replacement pattern, \1 and \2
stand for the text matched by the first and second subpatterns of the
regular expression (in sed, a subpattern is whatever sits between \( and
\)).  These regular expressions are
used by less and grep for searching, by sed, ed, and vi for lots of
things, and they're what makes perl (the sysadmin's swiss army chainsaw)
do the things it can do.  They're a mixed blessing, and a good example of
why UNIX is considered unfriendly.
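A safe way to watch \1 and \2 at work is to feed the expression a single canned line instead of real ls output.  This sketch reuses one line from the listing earlier in this message; the variable names line and cmd are just for the illustration.

```shell
# One canned line of ls -l output, so no real files are involved.
line='lrwxr-xr-x    1 root     wheel          21 Dec 14 07:10 /usr/local/lib/libpisock.dylib -> libpisock.9.0.0.dylib'

# \1 captures the link's path, \2 captures what it points to.
cmd=$(printf '%s\n' "$line" | sed 's/^.* \([^ ].*\) -> \(.*\)/ln -s \2 \/usr\/foo\/\1/')

# Only prints the generated command; nothing is executed.
echo "$cmd"
```

The leading ^.* is greedy, so it swallows everything up to the last space before the path, which is what leaves only the path in \1.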

Anyway, what the full command Larry described does, with the missing third
/ added, is this:

- find all symlinks in /usr/local/lib
- ls -l them so we can see what they are and where they point to
- use a very clever search-and-replace to turn them into ln commands
- write the ln commands to a file named tmp
- evaluate tmp as if it were a set of commands entered at the prompt
- delete the file tmp
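
The steps above can be rehearsed in a throwaway directory before going near real files.  This is a sketch under assumptions, not Larry's actual command: the directories src and dst, the file real.txt and the link alias are all made up, and find is run from inside src so the generated paths stay short.

```shell
# Sketch only: src, dst, real.txt and alias are made-up names, and
# everything happens inside a scratch directory.
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
echo "data" > "$work/src/real.txt"
ln -s real.txt "$work/src/alias"

# Find the symlinks, ls -l them, rewrite each line into an ln command and
# write the commands to a file named tmp.  The s command uses | as its
# delimiter so the slashes in $work need no escaping.
(cd "$work/src" && find . -type l -exec ls -l {} \;) |
    sed "s|^.* \([^ ].*\) -> \(.*\)|ln -s \2 $work/dst/\1|" > "$work/tmp"

generated=$(cat "$work/tmp")
echo "$generated"                 # look before you run it

# Evaluate tmp as if its lines had been typed at the prompt, then delete it.
sh "$work/tmp"
rm "$work/tmp"

linked=no
[ -L "$work/dst/alias" ] && linked=yes
echo "link created: $linked"
rm -rf "$work"                    # clean up the scratch directory
```

Printing the generated file before running it is the cheap insurance here: if the ln lines look wrong, just don't run the sh step.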


Here are the basic rules of regular expressions:

	.	matches any one character
	?	(not in basic sed) the thing before it may or may not be there
	*	matches any number of the thing before it (including 0)
	^	matches the start of a line
	$	matches the end of a line
	[]	matches any one of the characters inside it
	()	declares a subpattern (written \( \) in sed)
	+	(not in sed) matches one or more of the thing before it
	|	(not in sed) logical or
	\	escapes next character (note multiple escapes!)

It's not like wildcard matching for filenames.  In wildcard filenames, all
JPEG images might be "*.jpg", but in regex speak it would be ".*\.jpg".
That regular expression reads literally "match any number of any character
followed by a literal . and the letters jpg".
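
One difference from wildcards is easy to miss: a regular expression matches anywhere in the line unless you anchor it.  Here is a small sketch with grep on a made-up list of names (the variable names are only for the illustration).

```shell
# A made-up list of file names, one per line.
names='photo.jpg
notes.txt
xjpg
scan.jpg.bak'

# ".*\.jpg" matches anywhere in the line, so scan.jpg.bak gets through.
loose=$(printf '%s\n' "$names" | grep '.*\.jpg')

# Adding $ pins the match to the end of the line, like *.jpg does.
strict=$(printf '%s\n' "$names" | grep '.*\.jpg$')

echo "$loose"
echo "$strict"
```

The name xjpg is rejected both times because \. insists on a literal dot.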

I thought + should be in sed.  GNU sed accepts it written as \+, but that
is an extension and not portable; the portable way to say "one or more x"
is xx*.
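
Since + can't be relied on in sed, "one or more" is usually spelled out by hand.  A minimal sketch of two portable spellings, using a made-up input string:

```shell
# "One or more digits" without +: one digit, then any number more.
a=$(echo 'abc123def' | sed 's/[0-9][0-9]*/N/')

# Same idea using the POSIX interval \{1,\} (one or more repetitions).
b=$(echo 'abc123def' | sed 's/[0-9]\{1,\}/N/')

echo "$a"
echo "$b"
```

Both commands replace the whole run 123 with a single N.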

[] is kinda cool.  [abc] will match any of a, b, or c.  [0-9] will match
any digit, [-0-9] shows how to include "-" in the match (it must be first
or last), and [^0-9] means "any one character but a digit".  Want to match
a price in dollars and cents?  '\$ *[0-9]*(\.[0-9][0-9])?'  That's "A
literal dollar sign, followed (maybe) by spaces, any number of digits,
possibly followed by a literal . and two more digits".  (The ( ) and ?
make this an extended expression, the kind grep -E understands.)
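
Both ideas are easy to poke at from the prompt.  This is a sketch with made-up input strings; the price pattern is run through grep -E because it uses ( ) and ?, which plain sed expressions don't have.

```shell
# [^-0-9] is "anything except - and digits"; deleting those leaves the number.
digits=$(echo 'tel: 555-1234' | sed 's/[^-0-9]//g')
echo "$digits"

# The price pattern, tried against three made-up lines.
prices=$(printf '%s\n' 'total: $ 12.50' 'free today' '$5' |
    grep -E '\$ *[0-9]*(\.[0-9][0-9])?')
echo "$prices"
```

Because everything after the dollar sign is optional, this pattern accepts any line with a $ in it, so it is a starting point rather than a strict price check.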

You might have noticed I used single quotes around the dollars expression.
You'll want to do the same until you get used to it a bit more (and often
even then!)  The reason is that many of the same characters that are
special to regular expressions are special to shells.  * and ? for sure,
but also (), [], {}, \, and | are all special characters that the shell
might expand.  Any of these can be escaped using \, but then if you want a
\ to escape something in the expression you will need to escape your
escape, which is called "double escaping", "double quoting", or just
"yuck!" ;)

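To make the escaping business concrete, here is a small sketch that hands sed the same expression three ways.  The input string a*b is made up; in each case sed ends up seeing s/a\*b/X/.

```shell
# Three ways to give sed the expression s/a\*b/X/ (replace a literal "a*b").
single=$(echo 'a*b' | sed 's/a\*b/X/')       # single quotes: typed as-is
double=$(echo 'a*b' | sed "s/a\\*b/X/")      # double quotes: \\ becomes \
bare=$(echo 'a*b' | sed s/a\\\*b/X/)         # no quotes: \\ becomes \, \* becomes *
echo "$single $double $bare"
```

The single-quoted form is the only one where what you type is exactly what sed receives.
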
Generally, you feed sed the -e argument before an expression, but if there
is only one you can get away with not doing it, as Larry and I both did in
this case.  Technically, the s command can use almost any character as its
delimiter (anything but a backslash or a newline), not just /, but we
often write it as s/// so people know what we're talking about.  You can
use s,,, or s}}} or any other character instead of / as long as you use
the same one in all three places.  Often when people are using s/// on
filenames, they'll use something other than / so they don't have to type
\/ every time they want a / for a filename.  Also, I said something before
about flags.  The one you want to know is g, for global.  s/// will
replace only the first match on each line.  s///g will replace every match
on the line.
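
Here is a short sketch of those three points (another delimiter, the g flag, and more than one -e) on made-up input, all of it harmless because nothing is written to a file.

```shell
# A comma as the delimiter: no need to write \/ for every slash.
path=$(echo '/usr/local/lib' | sed 's,/usr/local,/opt,')

# Without g only the first match on the line changes; with g they all do.
first=$(echo 'a-b-c' | sed 's/-/+/')
all=$(echo 'a-b-c' | sed 's/-/+/g')

# Two expressions in one run, each introduced by -e.
both=$(echo 'abc' | sed -e 's/a/A/' -e 's/c/C/')

echo "$path $first $all $both"
```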

You CAN play with sed's s/// command to see how to use it safely using
echo and cat, as long as you don't write the output using > over any file.
echo "somestring" | sed -e 's/es/e s/' for example, is completely harmless
and shows the trivial case of using sed.  Try it out, I'm sure you'll get
the hang of it.  =)

