UNIX Tips for Mac OS X
Here is a list of short tips on using various UNIX tools under Mac
OS X - some of them might just be reminders for myself. Some of them
are also applicable to other flavors of UNIX. Comments are welcome
(email address at the bottom).
(10.3.x+ only) How do I split a PDF file into several from the command line?
Looking at the source code of join.py mentioned in this tip, I realized
it'd be easy to adapt it into a script that does the opposite: split a
PDF file into several files given a sequence of splitting points (in
terms of page numbers). This is exactly what I did: you can download
the script splitPDF.py and use it like this (make sure you do the
"chmod a+x splitPDF.py" dance first):
splitPDF.py input.pdf splitPageNum_1 ... splitPageNum_n
This will split the file input.pdf into (n + 1) files.
This is best illustrated by an example:
splitPDF.py input.pdf 3 5
Assuming input.pdf has 10 pages, you will get three files back:
input.part1.1_3.pdf contains pages 1-3, input.part2.4_5.pdf contains
pages 4-5, and input.part3.6_10.pdf contains pages 6-10 (note how the
page ranges are part of the output filenames).
I should mention that each splitPageNum_i should be an integer between
1 and the number of pages of input.pdf (inclusive), and the entire
sequence must be strictly increasing. Lastly, this script should work
on both Panther and Tiger (Mac OS X 10.3.x/10.4.x).
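To make the splitting rule concrete, here is a minimal Python 3 sketch of how the split points could be turned into page ranges and output names. It is not the code of splitPDF.py; the function names split_ranges and part_names are made up for illustration, and dropping an empty trailing range (when the last split point is the last page) is my own assumption.

```python
def split_ranges(num_pages, split_points):
    """Return inclusive (first, last) page ranges for the given split points."""
    # Enforce the constraints stated above: 1 <= point <= num_pages,
    # and the sequence must be strictly increasing.
    previous = 0
    for point in split_points:
        if not 1 <= point <= num_pages:
            raise ValueError("split point %d is out of range" % point)
        if point <= previous:
            raise ValueError("split points must be strictly increasing")
        previous = point

    ranges = []
    start = 1
    for point in split_points:
        ranges.append((start, point))
        start = point + 1
    # Assumption: if the last split point is the last page, nothing is
    # left over, so no empty trailing range is produced.
    if start <= num_pages:
        ranges.append((start, num_pages))
    return ranges


def part_names(stem, ranges):
    """Build names in the input.partK.first_last.pdf style shown above."""
    return ["%s.part%d.%d_%d.pdf" % (stem, k, first, last)
            for k, (first, last) in enumerate(ranges, 1)]


# The example from the text: a 10-page file split at pages 3 and 5.
print(part_names("input", split_ranges(10, [3, 5])))
```

For the 10-page example this yields the three names listed above, with ranges 1-3, 4-5 and 6-10.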
Just for completeness' sake, if you have ghostscript installed
(possibly via Fink), this is how you can extract pages from a PDF file
(all on one line):
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -dFirstPage=3 -dLastPage=5 -sOutputFile=input.3_5.pdf input.pdf
This will extract pages 3-5 (inclusive) from input.pdf into a new file
input.3_5.pdf.
(10.4.x+ only) How do I join multiple PDFs into one from the command line?
(Thanks to Stan Jou for pointing this script out to me)
Some of us sometimes need to join/combine/concatenate multiple PDF
files into one PDF file. There have been multiple ways to achieve this
without buying an extra piece of software. If you're a Tiger (OS X
10.4.x) user, things are even a bit easier - it turns out a Python
script has already been written for us by those kind Apple engineers.
This script is located at
/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py
(Actually the same script should work on Panther
(10.3.x) - but I cannot distribute it here)
You can make using the script a bit easier by creating a symlink to it
at a convenient place (all on one line - note the backslashes before
the spaces):
ln -s /System/Library/Automator/Combine\ PDF\ Pages.action/Contents/Resources/join.py joinPDF.py
Then you can use it to join, say, input1.pdf and input2.pdf into a file
final.pdf like this:
./joinPDF.py -o final.pdf input1.pdf input2.pdf
final.pdf is then the concatenation of input1.pdf and input2.pdf, in
that order.
There is another option available, if you look at the source code (note
that the options --preview and --append are not really implemented, but
I guess the latter is just equivalent to the script's default): that's
--shuffle. I'll just quote the explanation from the code:
Take a page from each PDF input file in turn before taking
another from each file. If this option is not specified then all of
the pages from a PDF file are appended to the output PDF file before
the next input PDF file is processed.
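The difference between the two page orders can be sketched in a few lines of Python 3. This is only an illustration on lists of page labels, not the code of join.py; the names append_order and shuffle_order are hypothetical, and I am assuming that once a shorter file runs out of pages the remaining files simply keep contributing theirs.

```python
from itertools import chain, zip_longest

_MISSING = object()  # marks "this file has no more pages"


def append_order(files):
    """Default behaviour: all pages of one file, then the next file."""
    return list(chain.from_iterable(files))


def shuffle_order(files):
    """--shuffle behaviour: one page from each file in turn."""
    rounds = zip_longest(*files, fillvalue=_MISSING)
    return [page for rnd in rounds for page in rnd if page is not _MISSING]


a = ["a1", "a2", "a3"]
b = ["b1", "b2"]
print(append_order([a, b]))
print(shuffle_order([a, b]))
```

With these two inputs the default order is a1 a2 a3 b1 b2, while the shuffled order is a1 b1 a2 b2 a3. This kind of interleaving is handy when, say, odd and even pages were scanned into separate files.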
Just for completeness' sake, if you have ghostscript installed
(possibly via Fink), this is how you can join multiple PDF files (all
on one line):
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=final.pdf input1.pdf input2.pdf
How do I get my IP address?
Many people are asking so I'm posting this Python script
for getting both the internal IP and the external IP
of your Mac: these two IPs might be different if you are behind a router
or a NAT device.
Most people will be interested in the external IP since this is the
IP the "outside world" sees you from. Here is the script:
#!/usr/bin/env python
import urllib, re, sys, os
# if this changes we need to revise the code to get the external IP
ip_telling_url = 'http://www.dyndns.org/cgi-bin/check_ip.cgi'
if len(sys.argv) == 1:
    # get the external IP
    mo = re.search(r'\d+\.\d+\.\d+\.\d+', urllib.urlopen(ip_telling_url).read())
    if mo:
        print mo.group()
    else:
        print 'Cannot get the external IP!'
else:
    # get the internal IP of an interface
    targetInt = sys.argv[1]
    output = os.popen('ipconfig getifaddr %s 2>&1' % targetInt).read().strip()
    if re.match(r'\d+\.\d+\.\d+\.\d+', output):
        print output
    else:
        print 'Cannot get the internal IP for interface \'%s\'' % targetInt
As usual, save this script to a file, say getip.py, and do a
"chmod a+x getip.py" to make it executable. To get the external IP, do
this:
./getip.py
To get the internal IP of a specific network interface, say en0, do
this:
./getip.py en0
Don't know what network interfaces are? They are the network adapters
inside your Mac. For example, on my PowerBook I have en0 for the
gigabit Ethernet interface (wired) and en1 for the AirPort interface
(wireless).
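The script above is written for the Python 2 that shipped with OS X at the time. If you only have Python 3 around, the external-IP half might look roughly like the sketch below. The helper names extract_ip and fetch_external_ip are my own, and the sample page text is invented just to exercise the regular expression; the real page may be formatted differently.

```python
import re
import urllib.request

# Same loose dotted-quad pattern as in the script above.
IP_PATTERN = re.compile(r"\d+\.\d+\.\d+\.\d+")


def extract_ip(text):
    """Return the first thing in text that looks like an IPv4 address, or None."""
    match = IP_PATTERN.search(text)
    return match.group() if match else None


def fetch_external_ip(url, timeout=10):
    """Download an IP-telling page and pull the address out of it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return extract_ip(response.read().decode("ascii", "replace"))


# Hypothetical page content, for illustration only.
sample = "<html><body>Current IP Address: 203.0.113.7</body></html>"
print(extract_ip(sample))
```

Keeping the parsing separate from the download makes it easy to try the pattern on saved page text without touching the network.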
How do I mass-rename file extensions? (variable substitution in bash)
Have you ever wanted to change the extension of a bunch of files that
share the same extension, e.g., change *.doc to *.txt? You certainly
don't want to do it one file at a time...
It turns out that bash (the default shell in OS X) has some nifty
tricks to save us - it's called variable/parameter substitution.
Some of the most useful ones are:
- "${var#pattern}" and "${var##pattern}": Removes from the beginning
  of $var the part that matches pattern; '#' removes the shortest
  possible match while '##' removes the longest possible match.
- "${var%pattern}" and "${var%%pattern}": Removes from the end of $var
  the part that matches pattern; '%' removes the shortest possible
  match while '%%' removes the longest possible match.
- "${var/pattern/replacement}" and "${var//pattern/replacement}":
  Replaces the first match ('/' version) or all matches ('//' version)
  with replacement.
Don't know what the heck that means? We'll conjure the second one ('%')
to do the work for us. Create the following script chgext:
#!/bin/sh
# quote the names so that files with spaces in them are handled too
for f in *."$1"
do
    mv "$f" "${f%$1}$2"
done
As usual, do a "chmod a+x chgext" to make it executable. For our
example (change *.doc into *.txt), use this command:
./chgext doc txt
To keep the tip short I'll only mention one more thing: you can use the
wildcard '*' in pattern - that will match all possible strings. But
pattern is not a regular expression - '.' won't be interpreted as it
would be in a regex (so it'll only match a dot character).
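If the shortest-versus-longest business is still fuzzy, here is a small Python 3 model of what the shell does for '#', '##', '%' and '%%'. It is only a sketch for building intuition: the function names are made up, and Python's fnmatch globbing is merely close to (not identical with) bash's pattern matching.

```python
from fnmatch import fnmatchcase


def strip_suffix(value, pattern, longest=False):
    """Model ${value%pattern}, or ${value%%pattern} when longest=True."""
    # A suffix starting at a larger index is shorter, so try the
    # largest index first when looking for the shortest match.
    indexes = range(len(value) + 1)
    if not longest:
        indexes = reversed(indexes)
    for i in indexes:
        if fnmatchcase(value[i:], pattern):
            return value[:i]
    return value  # no match: the value is left alone


def strip_prefix(value, pattern, longest=False):
    """Model ${value#pattern}, or ${value##pattern} when longest=True."""
    indexes = range(len(value) + 1)
    if longest:
        indexes = reversed(indexes)
    for i in indexes:
        if fnmatchcase(value[:i], pattern):
            return value[i:]
    return value


name = "archive.tar.gz"
print(strip_suffix(name, ".*"))                 # like ${name%.*}
print(strip_suffix(name, ".*", longest=True))   # like ${name%%.*}
print(strip_prefix("/usr/local/bin", "*/", longest=True))  # like ${p##*/}
# What chgext does for each file: ${f%$1}$2
print(strip_suffix("report.doc", "doc") + "txt")
```

The four lines printed are archive.tar, archive, bin and report.txt, which matches what the corresponding bash expressions give.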
How do I back up my stuff to an external drive using rsync?
Not everyone knows that in Mac OS X you don't need to buy expensive
software to do incremental backups. Using the UNIX command rsync, you
can perform intelligent incremental backups, meaning you only update
your backup files with their latest versions - no wasteful copying of
files that have not changed.
rsync is a very flexible and powerful tool - you can even back up to a
remote server. But I'll only show how you can back up your files to an
external drive (or to a different directory) - do a 'man rsync' to
learn about the other goodies it offers.
Here is the little script for this purpose:
#!/bin/bash
# (bash rather than sh: pushd and popd are bash builtins)
SOURCE_DIRS="Documents:Music:Pictures:Library/Mail:Downloaded stuff"
TARGET_DIR="/Volumes/External Drive"
# if the external drive is not there, complain and stop
if [ ! -e "$TARGET_DIR" ]
then
    echo "Target directory does not exist!"
    exit 1
fi
# split SOURCE_DIRS on colons only, so spaces in names survive
IFS=:
pushd .
cd ~/
/usr/bin/rsync -E --delete --progress -av $SOURCE_DIRS "$TARGET_DIR"
popd
SOURCE_DIRS is a list of folders you want to back up - they are
specified relative to your home folder, and are separated by colons
(`:') - so in the script the directory "~/Downloaded stuff" (note that
a space in a directory's name is okay) will be backed up. TARGET_DIR is
the place where you want to store the backup files: in this case an
external drive named "External Drive" is used (again note that a space
in the path is okay) - the backup files will be deposited directly
under the root directory of that drive. Feel free to customize both
variables to suit your needs. (Thanks to Paul Henrich for pointing me
to the space-related problems)
Note that a crucial option is added to the rsync line (thanks to
Patrick Cunningham and Brian Ashe): the '-E' switch is a special
addition to the Mac's built-in rsync, which copies the extended
attributes and resource forks that are used in the HFS/HFS+ filesystem.
To make sure the right version of rsync is used, I hard-coded the path
of rsync in the script.
To run it, save the file as, say, 'backup', and make it executable
(chmod a+x ./backup). After running it you should have an exact replica
of the specified folders on your external drive.
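If you would rather drive rsync from Python, the colon-and-IFS trick is not needed because the folder names can be kept in a real list. Below is a rough Python 3 sketch under the same assumptions as the shell script: Apple's bundled /usr/bin/rsync, where '-E' means extended attributes (in other rsync builds that flag can mean something else). The function names are hypothetical.

```python
import os
import subprocess


def build_rsync_command(source_dirs, target_dir, rsync="/usr/bin/rsync"):
    """Assemble the same rsync invocation as the shell script above."""
    # Each folder is its own list element, so spaces in names are safe.
    return ([rsync, "-E", "--delete", "--progress", "-av"]
            + list(source_dirs) + [target_dir])


def backup(source_dirs, target_dir):
    """Run the backup from the home folder; returns rsync's exit status."""
    if not os.path.exists(target_dir):
        raise SystemExit("Target directory does not exist!")
    # cwd= plays the role of the "cd ~/" step in the shell version.
    return subprocess.call(build_rsync_command(source_dirs, target_dir),
                           cwd=os.path.expanduser("~"))


print(build_rsync_command(["Documents", "Downloaded stuff"],
                          "/Volumes/External Drive"))
```

Only the command list is printed here; call backup(...) yourself once you are happy with what it would run.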
How do I find the pid of a process by its name?
In UNIX many chores involving processes require that you know the pid
(process ID) of the target processes before you can do anything with
them. Here is a simple script you can use to find out the pid of a
process by its name:
#!/bin/sh
ps axc|awk "{if (\$5==\"$1\") print \$1}"|tr '\n' ' '
Save it into a file, say 'pidof', and make it executable
(chmod a+x pidof). Then use it like this (assuming you're in the same
directory as pidof):
./pidof Finder
That will give you the pid of the Finder process.
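For the curious, here is what the awk one-liner is doing, spelled out as a Python 3 sketch that works on captured "ps axc" output. The function name pids_by_name and the sample listing are invented for illustration; the sketch assumes the five-column layout (PID, TT, STAT, TIME, COMMAND) that the awk script relies on. Unlike the awk version, it keeps command names containing spaces in one piece.

```python
def pids_by_name(ps_output, name):
    """Return the pids whose COMMAND column equals name exactly."""
    pids = []
    for line in ps_output.splitlines()[1:]:    # skip the header line
        fields = line.split(None, 4)           # at most 5 columns
        if len(fields) == 5 and fields[4].rstrip() == name:
            pids.append(int(fields[0]))
    return pids


# Made-up sample in the shape of "ps axc" output.
SAMPLE = """\
  PID  TT  STAT      TIME COMMAND
    1  ??  Ss     0:01.23 launchd
  215  ??  S      0:10.00 Finder
  302  ??  S      1:02.50 iTunes
"""

print(pids_by_name(SAMPLE, "Finder"))
```

To use it on live data you could feed it the text returned by subprocess.check_output(["ps", "axc"], text=True).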
What if the "Open With" option in Finder gives you duplicate apps or
misses some app?
For some reason the database maintained by LaunchServices is sometimes
out of sync with "reality" (what apps are or are not on your hard
drive). Fortunately you can force it to rebuild the database by running
the following command:
/System/Library/Frameworks/ApplicationServices.framework/\
Frameworks/LaunchServices.framework/Support/lsregister \
-kill -r -domain local -domain system -domain user
Running lsregister with no arguments will tell you what those options
are for.
Reloading Cisco VPN kernel extension
From time to time my Cisco VPN client just gives me crap like "cannot
load kernel extension" or "cannot find a valid IP address", even though
my connection is perfectly fine. In this case try the following in
Terminal.app:
sudo kextunload /System/Library/Extensions/CiscoVPN.kext
sudo kextload /System/Library/Extensions/CiscoVPN.kext
Setting environment variables for GUI apps
As UNIX users we all know that the way to set up the PATH variable (or
other environment variables) is to do that either in our .bash_profile
(if the default shell is bash) or .tcshrc file (if the default shell is
tcsh). Unfortunately the graphical apps do not get their environment
from those settings. To set it for them you need to create a file
~/.MacOSX/environment.plist and add your settings there. This
document at Apple will tell you the details.
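If you don't feel like typing the XML by hand, Python 3's plistlib can generate it. The sketch below only produces the text; it assumes, as I understand the format, that environment.plist is a flat dictionary mapping variable names to string values. The function name and the example variables are made up.

```python
import plistlib


def environment_plist(variables):
    """Serialize a {name: value} mapping as an XML property list."""
    return plistlib.dumps(dict(variables))


# Example values only - substitute your own settings.
data = environment_plist({
    "PATH": "/usr/local/bin:/usr/bin:/bin",
    "CVS_RSH": "ssh",
})
print(data.decode("utf-8"))
```

Saving the printed text as ~/.MacOSX/environment.plist (creating the ~/.MacOSX folder first if needed) is left as a manual step, so you can look it over before it takes effect.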
Create a disk image file for a folder, using hdiutil
Do this:
hdiutil create -srcfolder <src dir> -volname <volume name> <.dmg name>
This creates a compressed .dmg file with the name you specified; the
image has the contents of the folder you specified and the volume name
you gave.
Tell if a process is still alive
This assumes you know the ID of the process you want to watch (if you
don't, see this tip). The simple line below will give you the answer:
ps -p <process ID> -o pid | tail -n 1 | grep -v PID
If the process is still alive, you'll get the process ID back.
Otherwise nothing will be printed. This is useful in building a larger
script where monitoring a process is necessary.
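If the larger script happens to be written in Python, the same check can be done without running ps at all, by sending "signal 0" to the process. This is a minimal Python 3 sketch, not a replacement for the one-liner above; is_alive is a name I made up.

```python
import os


def is_alive(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)          # signal 0: only checks, sends nothing
    except ProcessLookupError:   # no such process
        return False
    except PermissionError:      # it exists, but belongs to someone else
        return True
    return True


# Our own process is certainly alive.
print(is_alive(os.getpid()))
```

Keep in mind that pids get reused over time, so a True answer only says that some process has that pid right now.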
Send an email to a bunch of people from the command line
This tip is actually a simple script written in Python - a powerful
scripting language shipped with Mac OS X 10.3 Panther. The script
allows you to send an email to multiple people from the command line.
This would be useful, for example, in sending a periodic reminder to a
bunch of people with the help of the system scheduler, cron. Note that
this tip is useful not only for Panther users, but also for users of
the other platforms that Python supports (e.g., Linux).
First, you need to create the following script: copy and paste the
content below into a file, say, smtp.py (`py' is the default extension
for Python scripts). Make sure you change the line "smtpHost=..." to
point to an SMTP server you are allowed to use. Now make it executable
by doing "chmod a+x smtp.py".
#!/usr/bin/env python
import smtplib, sys, time
# change this to a new SMTP server if desired
smtpHost = 'some.smtp.mail.server'
# sys.argv[1] is the sender
# sys.argv[2] is the filename pointing to the list of recipients
# sys.argv[3] is the subject
# sys.argv[4] is the filename pointing to the message content
if len(sys.argv) != 5:
    print 'Usage: ./smtp.py <sender> <recipient FN> <subj> <msg FN>'
    sys.exit(1)
# each recipient takes one line; '#' signals comments
rList = []
for line in open(sys.argv[2]).readlines():
    r = line.split('#')[0].strip()
    if r: rList.append(r)
sender = sys.argv[1]
subj = sys.argv[3]
date = time.ctime(time.time())
# a blank line must separate the headers from the message body
msg = 'From: %s\nTo: %s\nDate: %s\nSubject: %s\n\n%s' \
    % (sender, ', '.join(rList),
       date, subj, open(sys.argv[4]).read())
server = smtplib.SMTP(smtpHost) # connect, no login step
failed = server.sendmail(sender, rList, msg)
server.quit()
if failed:
    print 'smtp.py: Failed recipients:', failed
else:
    print 'smtp.py: No errors.'
To use the script, do this:
./smtp.py <sender> <recipient FN> <subj> <msg FN>
Among the arguments, `sender' is your email address, "recipient FN" is
a file containing a list of email addresses, with each address on its
own line, `subj' is the subject line you want to use, and finally
"msg FN" is a file containing the message body. For example, here is
the command I used to send out the weekly reminder for playing
basketball:
./smtp.py spambot@die.die.die bbPlayers.txt "Don't forget to play BB!" bbMsg.txt
A note about the sender argument: it has to be an email address from a
valid domain (so the line above won't work - it's deliberately
garbled), but other than that, there's no safety net to prevent you
from impersonating others - BUT DON'T. That's what spammers do;
besides, the email header will clearly show the originating IP address
of the message, so if someone WANTS to track you down, she/he WILL.
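For readers on Python 3, where the script above won't run as-is, here is a rough sketch of the same idea using the standard email package, which takes care of the header formatting. The function names are my own, the addresses are placeholders, and nothing is sent unless you call send() yourself with a server you are allowed to use.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formatdate


def parse_recipients(lines):
    """One address per line; '#' starts a comment, blank lines are skipped."""
    recipients = []
    for line in lines:
        address = line.split("#", 1)[0].strip()
        if address:
            recipients.append(address)
    return recipients


def build_message(sender, recipients, subject, body):
    """Assemble the headers and body into a proper message object."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Date"] = formatdate(localtime=True)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(smtp_host, msg):
    """Connect (no login step) and send; returns the refused recipients."""
    with smtplib.SMTP(smtp_host) as server:
        return server.send_message(msg)


# Placeholder data, for illustration only.
people = parse_recipients(["alice@example.org  # captain\n",
                           "# on vacation: carol@example.org\n",
                           "bob@example.org\n"])
print(build_message("me@example.org", people, "Reminder", "See you Friday."))
```

Printing the message shows exactly what would go over the wire, including the blank line between the headers and the body.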
Clean up .DS_Store files
Ever wanted to remove all those hidden .DS_Store files under some
directory? You can do this:
find <directory> -name .DS_Store -exec rm -f '{}' ';'
Or a faster version (thanks to Sean Kelly - the version above does show
you how to do arbitrary things to the files though):
find <directory> -name .DS_Store -delete
Just replace <directory> with the directory you want to clean up -
every sub-directory will be visited and cleaned up as well.
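The same sweep can be done from Python 3 with os.walk, which is handy if the clean-up is one step of a bigger script. This is just a sketch; remove_ds_store is a made-up name, and the demo runs in a throwaway temporary directory so it doesn't touch your real files.

```python
import os
import tempfile


def remove_ds_store(top):
    """Delete every .DS_Store under top; return the paths removed."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(top):
        if ".DS_Store" in filenames:
            path = os.path.join(dirpath, ".DS_Store")
            os.remove(path)
            removed.append(path)
    return removed


# Demo on a scratch directory with two .DS_Store files and one keeper.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    for rel in (".DS_Store", "keep.txt", os.path.join("sub", ".DS_Store")):
        open(os.path.join(top, rel), "w").close()
    print(len(remove_ds_store(top)))
    print(sorted(os.listdir(top)))
```

The demo reports 2 files removed and leaves keep.txt and the sub folder in place.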
Open a file with an app from the command-line
This is useful in two ways: (1) it saves you one trip to the mouse in
order to open some file; (2) it can force some app to open a file that
is not usually associated with that app. There are 3 possible forms:
open <some file>
open -a <some app> <some file>
open -e <some file>
The first form opens a file with the default (associated) app; the
second form opens a file with the specified app; and the last one opens
a file with TextEdit.app.
Make focus follow mouse in Terminal.app
This tip is from here.
Type this in the terminal:
defaults write com.apple.terminal FocusFollowsMouse -string YES
The next time you start Terminal.app, when the mouse is over any
Terminal.app window, that window will receive the input focus (type
away and you'll know). Do the above with YES replaced by NO to turn it
off.
By the way, the defaults command actually writes to an app's preference
file (.plist); in this case, the file modified is
~/Library/Preferences/com.apple.Terminal.plist.
How do I avoid automatic launching of xterm whenever I start Apple's X11?
If you look at /etc/X11/xinit/xinitrc, you'll find that by default
xterm is launched whenever you start Apple's X11. This could be quite
annoying if you don't want that. Fortunately the fix is very simple.
Just create your own .xinitrc file under your home directory, like
this:
#!/bin/sh
exec quartz-wm
How do I add an alias to an IP address?
(updated 20031123: as of Panther (OS X 10.3), BSD flat files such as
/etc/hosts are enabled again, so you no longer need to use Netinfo.app
for this)
This is another Netinfo-related tip. Sometimes we want to type a
shortened name of a machine to do various business with it; for
example, instead of typing `ssh 1.2.3.4' we want to type
`ssh mymachine'. Linux/UNIX guys know how to do this - just open up
/etc/hosts and add an entry to it. In Mac OS X you need to fire up
Netinfo.app (in /Applications/Utilities) instead: navigate to
/machines, and add a new entry per machine. For each entry you also
need to provide the ip_address and name properties.
(Of course, for the ssh example given above, you could just modify your
~/.ssh/config file - but that's another story)
How do I change my default shell?
If you come from the Linux/UNIX world like me, you know where to change
the setting of a user's default shell (/etc/passwd). But OS X does
things a little differently. To do that you need to fire up Netinfo.app
(in /Applications/Utilities). Navigate to /users/<user name> in
Netinfo's window, and find a property named "shell". The rest should be
obvious.
(Thanks to Robin Breathe) It turns out you can also achieve this by
simply using the command-line utility chsh. Say you want to change your
default shell from /bin/bash (the default for OS X) to /bin/tcsh; just
type this into the terminal:
chsh -s /bin/tcsh
Other nifty things are possible using chsh - again, "man chsh" for
more.
How do I suspend/resume a process (even the GUI ones)?
This works for both command-line and GUI processes. The latter case is
particularly useful, for example, when your video transcoding process
takes up almost 100% of CPU power and you want to do something that
requires at least some attention from the CPU...
Here you go: the first thing you need to do is to figure out the pid of
the process you want to suspend: see this tip. Say you want to suspend
iTunes while it is playing, and its pid turns out to be 2209; use the
following command to suspend that process:
kill -SIGSTOP 2209
Of course, do replace the pid with yours when you do this. Now the
music stops! The iTunes window is still there, but when you move the
mouse cursor over the window, all you get is the familiar beachball!
Ok, enough fun - how do we resume it? Use this:
kill -SIGCONT 2209
Now you have total control - cool, isn't it?
You can also combine this tip with the pidof script above like this (at
least in the bash shell, which is the default in OS X):
kill -SIGSTOP `pidof iTunes`
The command in the backquotes is executed first to get the pid of the
process "iTunes"; the pid then replaces the backquoted portion and the
entire command is executed. So in effect this stops iTunes.
(Thanks to Martin Dittus) Yet another way to completely bypass the need
to get the pid first is to use the command killall, like this:
killall -SIGSTOP iTunes
From my own experience, though, this is sometimes less robust than the
methods given above. To its credit, you can even use regular
expressions with the option '-m' to specify the processes you want to
send the signal to (but be careful - you don't want to end up with a
completely frozen system!).
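The same two signals can be sent from a Python 3 script with os.kill, which is useful when the suspending and resuming needs to be automated. What follows is a small sketch with made-up function names; the demo deliberately uses a harmless child process (a Python one-liner that just sleeps) rather than a real application.

```python
import os
import signal
import subprocess
import sys


def suspend(pid):
    """Same effect as: kill -SIGSTOP <pid>"""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Same effect as: kill -SIGCONT <pid>"""
    os.kill(pid, signal.SIGCONT)


def demo():
    """Suspend and resume a sleeping child; return True if it really stopped."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        suspend(child.pid)
        # WUNTRACED makes waitpid report a stopped (not just exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        resume(child.pid)
    finally:
        child.terminate()
        child.wait()
    return stopped
```

Calling demo() exercises the whole round trip; for a real process you would call suspend() and resume() with the pid found as described above.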
Building/installing the latest Emacs for OS X (native) from CVS
(this assumes you have already enabled the root account, and that
you'll do this entirely as root)
Are you an Emacs user? If yes, don't you want to use a native OS X
version of Emacs instead of being trapped inside the terminal? And how
about using the latest version of Emacs, shipped to you directly from
CVS? Here is how.
(Thanks to the people who have devoted their time to porting Emacs to
OS X)
Here is a screenshot of Emacs OS X to motivate you...
So here is how you check out a copy of the Emacs source code (you might
want to think about where you want to put this - it's about 90MB in
size):
cvs -z3 -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/emacs co emacs
This will start the download, so wait for a while. After that, create a
shell script, say emacs_build, with the following content:
#!/bin/sh
cp -Rp emacs emacs.build
cd emacs.build
CFLAGS='-O3 -faltivec' CXXFLAGS='-O3 -faltivec' ./configure \
--enable-carbon-app=/Applications/Development --without-x
make bootstrap
Note that the line starting with `CFLAGS' is too long, so I had to
break it into several segments (connected with '\'; if you want to type
them on one line, just remove the '\' - this applies to the rest of
this part); just note that the last line should be `make bootstrap'.
Also, if you want to change the folder Emacs is installed into, change
the `--enable-carbon-app=' setting to the correct folder (here I chose
to install it in /Applications/Development). Lastly, note that the
building process starts by copying the source tree to another directory
called emacs.build - this prevents polluting the source tree.
Ok, now make emacs_build executable by issuing chmod a+x ./emacs_build
in the terminal. Then execute it.
After about 30-40 minutes (depending on the speed of your machine), the
build should finish. You can then do "cd emacs.build; make install" to
install the whole thing. But better yet, create a shell script
emacs_install with the following content:
#!/bin/sh
rm -rf /Applications/Development/Emacs.app/ /usr/local/bin/emacs* \
/usr/local/share/emacs/21.3.50/
cd emacs.build
make install
cd ..
rm -rf emacs.build
Again the "rm -rf" line is broken into two segments using '\'. Make
this script executable, and execute it. It will remove the old stuff
first, install the new build, and wipe the build directory clean.
But what if, from time to time, you want to update the Emacs source
tree from CVS, so you'll have the latest bugfixes? Create a shell
script called emacs_update like the following:
#!/bin/sh
rm -f emacs/lisp/loaddefs.el*
cd emacs
cvs -z3 -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/emacs update
cd ..
Again make it executable and then run it to update your Emacs source
tree. Of course this assumes you still keep the source tree around (in
the emacs directory) - the CVS update will only download the necessary
files.
How do I enable the root account?
By default the root account is disabled in Mac OS X - you have to do
everything that requires root privilege using sudo instead, and this
could become quite annoying after a while. To enable the root account,
fire up Netinfo.app (in /Applications/Utilities), and click on the
Security menu - you'll see an item Enable Root Account.
Philip Bruce sent in an alternative: just do a "sudo su" and you'll be
dropped into a shell as root.
Firewall and iTunes sharing - what ports to open?
If you have a firewall running on your Mac, either configured via the
built-in Sharing Preferences panel or with a third-party tool such as
Flying Buttress, make sure you open these ports to let iTunes traffic
through:
- Multicast-DNS (mDNS): this is UDP port 5353. This port is necessary
  to be able to see all the shared playlists (and to be able to
  automagically discover all the Rendezvous-enabled services).
- DAAP: this is TCP port 3689. This port is for the actual iTunes data
  traffic.
If you enable only one of the above and not the other, you might be
able to see the shared playlists but not play them, or vice versa.
If you're using BrickHouse, which I highly recommend over Apple's
built-in firewall configuration interface, what you might end up with
is something like this:
And these two rules are at the beginning of my firewall rules. You
might also notice I only allow connections to/from 128.2.0.0 (CMU).
Who is listening to my tunes?
Ever wondered who those "n users connected" are when you look at the
Sharing part of the Preferences of your iTunes? Well, wonder no more -
it turns out to be fairly easy to figure out in Terminal.app; just type
this line:
lsof -r 2 -n -P -F n -c iTunes -a -i TCP@`hostname`:3689
and it will tell you the IP address of each connection, together with
the music file the connection is listening to. lsof is a UNIX tool
which is capable of listing the files a particular process opens -
including even the files being accessed from remote connections, such
as NFS, and iTunes in our case. Some short explanations of the
parameters used:
-r 2: list the open files repeatedly, and refresh the list every 2
  seconds.
-n: show numeric IP addresses instead of domain names.
-P: show port info as numbers instead of names (e.g., 3689 vs. `daap').
-F n: display the field `name' (n).
-c iTunes: only list files opened by the process named 'iTunes'.
-a: logical `and' to connect more than one condition.
-i TCP@`hostname`:3689: only list files involving the specified
  address; in this case the protocol must be TCP, connecting to my
  machine, and connecting to port 3689.
Of course, based on the result you can do all kinds of fancy things,
like accumulating statistics about which tunes get listened to most
often, and so on. I'm too lazy to write an app like that - let me know
if you write one though!
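As a starting point for such an app, here is a Python 3 sketch that pulls the remote addresses out of lsof's field output. The function name and the sample text are made up; in particular, I am assuming that each connection shows up as a line beginning with 'n' in the form local:port->remote:port, so check that against what your lsof actually prints.

```python
def remote_hosts(lsof_output):
    """Return the distinct remote addresses, in order of first appearance."""
    hosts = []
    for line in lsof_output.splitlines():
        # Only 'n' (name) fields that describe a connection are of interest.
        if not line.startswith("n") or "->" not in line:
            continue
        remote = line[1:].split("->", 1)[1]   # e.g. 192.168.1.23:49152
        host = remote.rsplit(":", 1)[0]       # drop the port
        if host not in hosts:
            hosts.append(host)
    return hosts


# Invented sample in the assumed "lsof -F n" shape.
SAMPLE = """\
p302
n*:3689
n192.168.1.10:3689->192.168.1.23:49152
n192.168.1.10:3689->192.168.1.42:50211
n192.168.1.10:3689->192.168.1.23:49153
"""

print(remote_hosts(SAMPLE))
```

From there, counting how often each address appears over time is a short step with a dictionary or collections.Counter.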