Filed under: Software updates
Starting processes at boot time, and keeping them running, is not glamorous but it’s important (if your computer gets power cycled, it’s nice to have your applications bounce right back… but I’m sure you already do this).
I put together a utility called pmtr (process monitor) to run my application daemons at boot time, and restart them as needed. I wanted to keep all of these jobs defined in one place: one configuration file. No rooting around in sysvinit files. Just my application’s jobs, all in one place. Like this,
/etc/pmtr.conf
job {
name weather-server
cmd /usr/local/bin/weather -v
}
# this one keeps stderr and stdout
job {
name doppler-update
cmd /usr/bin/dop-up
dir /var/log
err dop.err
out dop.out
}
That’s a simple pmtr.conf with just two jobs. No rocket science, just two jobs, in one place. It supports a handful of job configuration options.
A friend asked me to support “deltas” when you change the configuration file, and this turned into one of the most useful features. It’s really two handy features that go together: pmtr monitors the file (using inotify) for changes so you don’t have to signal it when you make a change; then, when it rescans the configuration, it only fiddles with the jobs you’ve actually changed. You can disable or reconfigure jobs, or add new ones, and have that take effect without disrupting the other ones.
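The file-watching half of that idea can be sketched with the Linux inotify API. This is an illustration only, not pmtr’s actual code: the function names watch_file and wait_for_change are made up for the example, and it assumes the file is edited in place.

```c
#include <poll.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: start watching one file for writes.
 * Returns an inotify descriptor, or -1 on error. */
int watch_file(const char *path) {
    int fd = inotify_init();
    if (fd < 0) return -1;
    if (inotify_add_watch(fd, path, IN_MODIFY | IN_CLOSE_WRITE) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: wait up to timeout_ms for a change.
 * Returns 1 if the file changed, 0 on timeout, -1 on error. */
int wait_for_change(int fd, int timeout_ms) {
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int rc = poll(&p, 1, timeout_ms);
    if (rc <= 0) return rc;              /* 0 = timeout, -1 = error */
    char buf[4096];                      /* room for several queued events */
    ssize_t n = read(fd, buf, sizeof buf);   /* drain what is queued */
    return n > 0 ? 1 : -1;
}
```

A caller would rescan the configuration each time wait_for_change returns 1. One caveat: editors that save by renaming a new file over the old one replace the watched file, so watching the containing directory is the more robust choice in practice.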
It’s not sysvinit, it’s not upstart, it’s not sophisticated. It’s just … handy.
You can read more about pmtr on its GitHub page.
Filed under: Software updates
I’ve posted my ØMQ zcontrol control port library to GitHub tonight. This makes it easy to embed an interactive control port in your ØMQ (v2 or v3)-based C programs.
Example of using a control port
% ./zcon -e tcp://127.0.0.1:3333
Connecting to tcp://127.0.0.1:3333.
zcon> help
help        this text
version     version info
shutdown    shutdown server
quit        close session
zcon> version
Current 0MQ version is 3.0.3
zcon> quit
You can add commands (for example, to control or monitor your program) using the zcontrol API. Use the zcon client shown above to run these commands interactively. Or, you can invoke them from any ØMQ req socket. I’ve used zcontrol both ways: manually and from a PHP script.
For further information please see,
- the zcontrol documentation: http://troydhanson.github.com/zcontrol/
- or git clone git://github.com/troydhanson/zcontrol.git
Feedback or suggestions are welcome; feel free to leave comments.
Filed under: Software updates
kvspool, coming soon
Pending some final approvals, I’m looking forward to releasing a new library this week for streaming data processing. It’s a small API for spooling off events (key-value sets, aka a dictionary, hash, etc.). From there, other programs can consume them (possibly at a slower rate, hence you can couple fast and slow processes, while limiting the total resource consumption of the buffer). Or you can publish the stream to remote subscribers, and they will continuously receive the events written to the local stream. It’s a tiny API in C, Python, Perl and Java.
Zero MQ
Just a few words of appreciation for Zero MQ (sorry, “ØMQ”). These guys have taken a new conceptual approach to sockets (layering an API over standard TCP/IP sockets) that takes care of most of the headaches in network programs. I won’t go into detail, but I immediately used pub-sub, then went on to req-rep and push-pull. I love that the API is tiny and the docs are great (the ASCII diagrams are rendered nicely as PNG; I discovered they are using ditaa for that, by the way). I ported my controlport library for use with programs based on a ZeroMQ message loop. I hope to post it here soon.
Entropy
For those who appreciate a dose of theory with their morning coffee, my notes about measuring entropy of a byte stream are on Scripts & Snippets. Since I’m more interested in practical applications, here’s a program that calculates the ultimate percentage to which an input file could be compressed.
Filed under: snippets
My notes about UNIX domain sockets are now on Scripts & Snippets. Here’s the condensed version.
Communication within a host
UNIX domain sockets are a method by which processes on the same host can communicate. Communication is bidirectional with stream sockets; datagram sockets are connectionless, so a server can reply only to a client that has bound an address of its own.
Identity
Instead of identifying a server by an IP address and port, a UNIX domain socket is known by a pathname. Obviously the client and server have to agree on the pathname for them to find each other. The server binds the pathname to the socket.
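A minimal sketch of both sides, assuming a stream socket. The helper names (fill_addr, unix_listen, unix_connect) are made up for the example, and the server removes any stale socket file before binding.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a sockaddr_un for the given pathname; -1 if the path is too long. */
static int fill_addr(struct sockaddr_un *addr, const char *path) {
    if (strlen(path) >= sizeof(addr->sun_path)) return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Server side: bind the pathname to a socket and start listening. */
int unix_listen(const char *path) {
    struct sockaddr_un addr;
    if (fill_addr(&addr, path) < 0) return -1;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    unlink(path);   /* remove a stale socket file left by an earlier run */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: connect to the pathname the server agreed on. */
int unix_connect(const char *path) {
    struct sockaddr_un addr;
    if (fill_addr(&addr, path) < 0) return -1;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The server then calls accept on the listening descriptor as it would for a TCP socket, and both ends use read and write on their connected descriptors.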
File permissions control who can connect
For UNIX domain sockets, file and directory permissions restrict which processes on the host can open the file, and thus communicate with the server. Therefore, UNIX domain sockets provide an advantage over Internet sockets (to which anyone can connect, unless extra authentication logic is implemented).
Comparison with named pipes for IPC
IPC within a UNIX host may be accomplished by several other means, including named pipes. What circumstances favor UNIX domain sockets versus pipes? The choice is influenced by these factors:
- Duplex
- Stream sockets provide bi-directional communication while named pipes are uni-directional.
- Distinct clients
- Clients using sockets each have an independent connection to the server. With named pipes, many clients may write to the pipe, but the server cannot distinguish the clients from each other: the server has only one descriptor to read from the named pipe. Because the named pipe has only one read descriptor and possibly multiple writers, data from different clients can also interleave unpredictably if a client writes more than PIPE_BUF bytes in one operation. Since pipes have these limitations, UNIX domain sockets should be used if there are multiple clients that need to be distinguishable or which write long messages to the server.
- Method of creating and opening
- Sockets are created using socket and assigned their identity via bind. Named pipes are created using mkfifo. To connect to a UNIX domain socket the normal socket/connect calls are used, but a named pipe is written using regular file open and write. That makes named pipes easier to use from a shell script, for example.
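The named-pipe side of that comparison can be sketched as follows. The helper names are made up for the example, and both ends are opened non-blocking so the sketch can run within a single process.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Server side: create the FIFO if needed and open it for reading.
 * O_NONBLOCK lets the open return before any writer shows up. */
int fifo_open_reader(const char *path) {
    if (mkfifo(path, 0600) < 0 && errno != EEXIST) return -1;
    return open(path, O_RDONLY | O_NONBLOCK);
}

/* Client side: plain open/write/close, as with a regular file.
 * With O_NONBLOCK the open fails (ENXIO) if no reader has the FIFO open. */
int fifo_send(const char *path, const char *msg) {
    size_t len = strlen(msg);
    int fd = open(path, O_WRONLY | O_NONBLOCK);
    if (fd < 0) return -1;
    ssize_t n = write(fd, msg, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```

Because the client side is just open and write, a shell script can do the same thing with a redirect such as echo hello > /path/to/fifo, with no socket code at all.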
Linux Abstract Socket Namespace
Linux has a special feature: if the pathname for a UNIX domain socket begins with a null byte, its name is not mapped into the filesystem. Thus it won’t collide with other names in the filesystem. Also, when a server closes its UNIX domain listening socket in the abstract namespace, the name disappears automatically (there is no file to clean up); with regular UNIX domain sockets, the file persists after the server closes it.
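A Linux-only sketch of the abstract namespace. The helper names are made up for the example. The main difference from a pathname socket is the address length passed to bind and connect: it has to cover the leading null byte plus the name, because the kernel treats every byte within that length as part of the name.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Build an abstract address: sun_path[0] is '\0', the name follows it.
 * Returns the address length to pass to bind/connect, or 0 if too long. */
static socklen_t abstract_addr(struct sockaddr_un *addr, const char *name) {
    size_t len = strlen(name);
    if (len + 1 > sizeof(addr->sun_path)) return 0;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path + 1, name, len);   /* sun_path[0] stays '\0' */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
}

/* Server side: bind the abstract name and listen. */
int abstract_listen(const char *name) {
    struct sockaddr_un addr;
    socklen_t alen = abstract_addr(&addr, name);
    if (alen == 0) return -1;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    if (bind(fd, (struct sockaddr *)&addr, alen) < 0 || listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: connect to the same abstract name. */
int abstract_connect(const char *name) {
    struct sockaddr_un addr;
    socklen_t alen = abstract_addr(&addr, name);
    if (alen == 0) return -1;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    if (connect(fd, (struct sockaddr *)&addr, alen) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

There is no unlink call anywhere: once the listening socket is closed, the same name can be bound again right away. The trade-off is that file permissions no longer apply, since there is no file to put them on.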
Resources
Here are some C programs that implement a UNIX domain socket client and server. These are placed in the public domain.
Filed under: Software updates
uthash v1.9.4 has been released.
Version 1.9.4 (2011-06-05)
- uthash now supports MurmurHash v3
- utlist now includes concatenation macros (LL_CONCAT and DL_CONCAT)
- utarray now supports binary search (utarray_find)
- utstring now supports a new-or-clear-existing macro (utstring_renew)
- documented technique for a multi-level hash table
- clarified scope requirements for UT_icd in the utarray documentation
- fixed termination when utstring_clear is followed by utstring_body
- fixed utarray_inserta macro when used with complex arguments
- on Visual Studio define missing type uint8_t
- Debian/Ubuntu include uthash in the package uthash-dev.
- uthash has been downloaded 16,211 times.
Thanks to Yu Feng, Richard Cook, Dino Ciuffetti, Chris Groer, and Arun Cherian for feedback and fixes in this release!
Edited to add: nice example of an LRU cache with uthash at http://jehiah.cz/a/uthash
Filed under: Software updates
Here’s an example of software just working well: I went to write a “startup script” for my process (you know, one of those init.d startup scripts that gets your process going when the box is rebooted). I learned that, on Ubuntu, the init process has been replaced with the very friendly “upstart” package.
Here’s how a startup script looks. This is for the acpid daemon:
# acpid - ACPI daemon
#
# The ACPI daemon provides a socket for other daemons to multiplex kernel
# ACPI events from, and a framework for reacting to those events.
description "ACPI daemon"
start on runlevel [2345]
stop on runlevel [!2345]
expect fork
respawn
exec acpid -c /etc/acpi/events -s /var/run/acpid.socket
I copied this file (renaming it for my own process and adjusting the exec line), put the file in /etc/init/pm.conf and voilà, my work was done. I just had to start up the process manually the first time using,
service pm start
A few pointers:
- If you want to see the upstart logs, look in /var/log/daemon.log.
- upstart itself is documented in init(8) (man -s 8 init)
- the conf file syntax is documented in init(5) (man -s 5 init)
Filed under: snippets
On my Scripts and Snippets page, there are a dozen or so tiny C programs that each show usage of a particular library (like PCRE), technique (like a DNS lookup), tricky task (like writing a parser) or easy “boilerplate” code (like reading some options and iterating over the lines of a file).
These programs are there to jog my memory, or to give me something to quickly copy and modify when doing something similar in the future. Here are some of the snippets that are there today. Maybe they’ll be useful to you if you need to see a quick, minimal complete program that does one of these tasks. They’re all placed in the public domain.
C snippets
- lemon using the Lemon parser generator
- dns doing DNS lookup
- udp UDP client and server
- tcp TCP client and server
- ifinfo list network interfaces, their IPv4/v6 addresses and MACs
- pcre using pcre library
- dlsym using dlsym
- readline using readline library
- looper iterating over lines of file
- slurp using read or mmap to slurp a whole file
- mworkers a sigsuspend-based parent process managing multiple worker processes
- bloom building a Bloom filter from lines of one file, and testing other lines on it
- zlib a simple gzip/gunzip implementation using the zlib C API
- tconf function to read configuration file entries
- iconv using the iconv library
Filed under: Software updates
uthash v1.9.3 has been released. This is a minor bug fix release.
Version 1.9.3 (2010-10-31)
- fix an ifdef for compatibility with Intel compiler (thanks, degski!)
- fix HASH_ITER macro to satisfy C++ casting rules (thanks, Erik Bai!)
Filed under: technotes
uthash v1.9.2 has been released. This is a minor enhancement release with a new iteration macro, a bug fix and a port of the hashscan utility to FreeBSD. Here is the ChangeLog entry:
Version 1.9.2 (2010-10-04)
- new HASH_ITER macro for more convenient deletion-safe iteration
- hashscan can now run on FreeBSD 8.1 and later (thanks, Markus Gebert!)
- More parens to evaluate complex macro arguments properly (thanks, ngg!)
- Add sz parameter to the uthash_free hook for platforms that do their own memory management. Hopefully this minor API change doesn’t cause too much breakage for people. (thanks, Niall Douglas!)
- uthash has been downloaded 12,294 times
Upcoming tpl release
I hope to make a release of tpl before the year is out. So, this would be a good time to submit any feature requests. Feel free to post them here or email me directly at tdh@tkhanson.net.
Filed under: technotes
In 2009 I put together a set of slides about Virtual Memory (pdf). Recently someone suggested I post them online, so here they are. This might be of interest to anyone doing Linux/Unix programming who’s struggled with questions like,
- How much memory does my process really take?
- What do RSS and VSIZE mean?
- If two processes share memory, does their RSS/VSIZE give a misleading picture of true memory usage?
- What does it mean for the kernel to “overcommit” memory? Why does it do that?
- When a process creates another process via fork, does all its memory get copied? What is copy-on-write?
- Can I watch these phenomena in action using system commands?
Now, I’m not a memory expert. If you need a deep reference on Linux kernel memory management, this is not the place to look. But if you’re a programmer looking for a 10 minute demystification of Linux virtual memory then you might find it helpful.
smem
Also, see this post on Blog O’ Matty about the smem utility. For newer kernels (> 2.6.27) this tool can report a new Proportional Set Size (PSS). This is an interesting way to deal with the shortcoming of RSS: when two processes share resident memory pages, RSS makes it look like they’re both consuming that much memory, but PSS will show each process consuming part of the total. To put that in perspective, start several bash shells and run smem. Here, you can see RSS makes it “look” like each shell uses over 3 MB of RAM, but PSS reveals that due to sharing each shell really uses less than 2 MB.
$ smem -P bash
 PID User     Command   USS   PSS   RSS
2521 hanson   bash     1928  2029  3488
2538 hanson   bash     1928  2029  3488
2555 hanson   bash     1928  2029  3488
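The same two figures can be read for the current process without smem. On Linux the kernel reports per-mapping Rss and Pss values (in kB) in /proc/<pid>/smaps, and summing them gives process totals. This is a minimal sketch, not how smem itself is written; the function name smaps_total_kb is made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Sum one field (e.g. "Rss:" or "Pss:") over every mapping listed in
 * /proc/self/smaps. Returns the total in kB, or -1 if the file is unavailable. */
long smaps_total_kb(const char *field) {
    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f) return -1;
    char line[512];
    size_t flen = strlen(field);
    long total = 0, kb;
    while (fgets(line, sizeof line, f)) {
        /* Lines of interest look like "Pss:                 12 kB". */
        if (strncmp(line, field, flen) == 0 &&
            sscanf(line + flen, "%ld", &kb) == 1)
            total += kb;
    }
    fclose(f);
    return total;
}
```

Calling smaps_total_kb("Rss:") and smaps_total_kb("Pss:") from the same process shows the gap described above: the PSS total is never larger than the RSS total, and it is smaller whenever shared pages, such as those of the C library, are also mapped by other processes.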
