Software updates by Troy D. Hanson


pmtr: a process monitor
November 8, 2011, 1:49 am
Filed under: Software updates

Starting processes at boot time, and keeping them running, is not glamorous but it’s important (if your computer gets power cycled, it’s nice to have your applications bounce right back… but I’m sure you already do this).

I put together a utility called pmtr (process monitor) to run my application daemons at boot time, and restart them as needed. I wanted to keep all of these jobs defined in one place– one configuration file. No rooting around in sysvinit files. Just my application’s jobs, all in one place. Like this,

/etc/pmtr.conf

  job {
    name weather-server
    cmd /usr/local/bin/weather -v
  }

  # this one keeps stderr and stdout
  job {
    name doppler-update
    cmd /usr/bin/dop-up
    dir /var/log
    err dop.err
    out dop.out
  }

That’s a simple pmtr.conf with just two jobs. No rocket science – just two jobs, in one place. There’s a handful of job configuration options it supports.

A friend asked me to support “deltas” if you change the configuration file- and this turned into one of the most useful features. It’s really two handy features that go together: pmtr monitors the file (using inotify) for changes so you don’t have to signal it when you make a change; then, when it rescans the configuration, it only fiddles with the jobs you’ve actually changed. You can disable or reconfigure jobs, or add new ones, and have that take effect without disrupting the other ones.
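
To make the file-watching half of that concrete, here is a minimal sketch of noticing a configuration change with inotify. It is not pmtr's actual code; the function names `watch_file` and `wait_for_change` are made up for illustration, and it assumes Linux.

```c
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: start watching one file for completed writes.
 * Returns an inotify descriptor, or -1 on error. */
int watch_file(const char *path) {
    assert(path != NULL);
    int fd = inotify_init();
    if (fd == -1) {
        perror("inotify_init");
        return -1;
    }
    /* IN_CLOSE_WRITE fires when a writer closes the file */
    if (inotify_add_watch(fd, path, IN_CLOSE_WRITE) == -1) {
        perror("inotify_add_watch");
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: wait up to timeout_ms for a change.
 * Returns 1 if the file changed, 0 on timeout, -1 on error.
 * Pending events are drained without being parsed. */
int wait_for_change(int fd, int timeout_ms) {
    char buf[4096];
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int rc = poll(&p, 1, timeout_ms);
    if (rc <= 0) return rc;
    return read(fd, buf, sizeof(buf)) > 0 ? 1 : -1;
}
```

A daemon would call `wait_for_change` in its main loop and rescan the configuration whenever it returns 1. One caveat: some editors save by writing a new file and renaming it over the old one, which replaces the watched inode, so watching the containing directory is the more robust choice.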

It’s not sysvinit, it’s not upstart, it’s not sophisticated. It’s just … handy.

You can read more about pmtr on its GitHub page.



ZeroMQ control port
October 30, 2011, 2:41 pm
Filed under: Software updates

I’ve posted my ØMQ zcontrol control port library to GitHub tonight. This makes it easy to embed an interactive control port in your ØMQ (v2 or v3)-based C programs.

Example of using a control port

% ./zcon -e tcp://127.0.0.1:3333
Connecting to tcp://127.0.0.1:3333.
zcon> help
help                 this text
version              version info
shutdown             shutdown server
quit                 close session

zcon> version
Current 0MQ version is 3.0.3

zcon> quit

You can add your own commands (for example, to control or monitor your program) using the zcontrol API. Use the zcon client shown above to run these commands interactively. Or, you can invoke them from any ØMQ req socket. I’ve used zcontrol both ways, e.g. manually and from a PHP script.

For further information please see,

Feedback or suggestions are welcome- feel free to leave comments.



Data streams, ZeroMq, entropy
October 9, 2011, 9:13 pm
Filed under: Software updates

kvspool, coming soon

Pending some final approvals, I’m looking forward to releasing a new library this week for streaming data processing. It’s a small API for spooling off events (key-value sets, aka a dictionary, hash, etc). From there, other programs can consume them (possibly at a slower rate, hence you can couple fast and slow processes, while limiting the total resource consumption of the buffer). Or you can publish the stream to remote subscribers, which will continuously receive the events written to the local stream. It’s a tiny API in C, Python, Perl and Java.

Zero MQ

Just a few words of appreciation for Zero MQ (sorry, “ØMQ”). These guys have taken a new conceptual approach to sockets (layering an API over standard TCP/IP sockets) that takes care of most of the headaches in network programs. I won’t go into detail, but I immediately used pub-sub, then went on to req-rep and push-pull. I love that the API is tiny and the docs are great (the ASCII diagrams are rendered nicely as PNG; I discovered they are using ditaa for that, by the way). I ported my controlport library for use with ZeroMQ message-loop-based programs. Hope to post it here soon.

Entropy

For those who appreciate a dose of theory with their morning coffee, my notes about measuring entropy of a byte stream are on Scripts & Snippets. Since I’m more interested in practical applications, here’s a program that calculates the ultimate percentage to which an input file could be compressed.



UNIX domain sockets
July 2, 2011, 8:35 pm
Filed under: snippets

My notes about UNIX domain sockets are now on Scripts & Snippets. Here’s the condensed version.


Communication within a host

UNIX domain sockets are a method by which processes on the same host can communicate. Communication is bidirectional with stream sockets and unidirectional with datagram sockets.

Identity

Instead of identifying a server by an IP address and port, a UNIX domain socket is known by a pathname. Obviously the client and server have to agree on the pathname for them to find each other. The server binds the pathname to the socket.
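
Here is a minimal sketch of both sides agreeing on a pathname. The helper names (`make_addr`, `uds_listen`, `uds_connect`) are made up for illustration; the system calls are the standard ones.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un from a pathname; returns -1 if the path is too long */
static int make_addr(const char *path, struct sockaddr_un *addr) {
    assert(path != NULL);
    if (strlen(path) >= sizeof(addr->sun_path)) return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Server side: create a stream socket, bind it to path, and listen */
int uds_listen(const char *path) {
    struct sockaddr_un addr;
    if (make_addr(path, &addr) == -1) return -1;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) return -1;
    unlink(path);   /* remove a stale socket file left by an earlier run */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, 5) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: connect to the server by the agreed pathname */
int uds_connect(const char *path) {
    struct sockaddr_un addr;
    if (make_addr(path, &addr) == -1) return -1;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The server then calls accept on the descriptor from `uds_listen` and gets one connected descriptor per client, which it can read from and write to like any stream socket.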

File permissions control who can connect

For UNIX domain sockets, file and directory permissions restrict which processes on the host can open the file, and thus communicate with the server. Therefore, UNIX domain sockets provide an advantage over Internet sockets (to which anyone can connect, unless extra authentication logic is implemented).

Comparison with named pipes for IPC

IPC within a UNIX host may be accomplished by several other means including named pipes. What circumstances favor UNIX domain sockets versus pipes? The choice is influenced by these factors:

Duplex
Stream sockets provide bi-directional communication while named pipes are uni-directional.
Distinct clients
Clients using sockets each have an independent connection to the server. With named pipes, many clients may write to the pipe, but the server cannot distinguish the clients from each other– the server has only one descriptor to read from the named pipe. Because the named pipe has only one read descriptor and possibly-multiple writers, random interleaving can also occur if a client writes more than PIPE_BUF bytes in one operation. Since pipes have these limitations, UNIX domain sockets should be used if there are multiple clients that need to be distinguishable or which write long messages to the server.
Method of creating and opening
Sockets are created using socket and assigned their identity via bind. Named pipes are created using mkfifo. To connect to a UNIX domain socket the normal socket/connect calls are used, but a named pipe is written using regular file open and write. That makes named pipes easier to use from a shell script, for example.
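
For contrast with the socket calls, here is a minimal sketch of the named-pipe side. The helper names are made up for illustration. The reader opens the FIFO with O_NONBLOCK so that open returns even though no writer exists yet.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Server side: create the FIFO if needed and open it for reading.
 * O_NONBLOCK lets open() return before any writer has shown up. */
int fifo_open_reader(const char *path) {
    assert(path != NULL);
    if (mkfifo(path, 0600) == -1 && errno != EEXIST) return -1;
    return open(path, O_RDONLY | O_NONBLOCK);
}

/* Client side: an ordinary open and write, no socket calls involved.
 * Returns 0 if the whole message was written, -1 otherwise. */
int fifo_send(const char *path, const char *msg) {
    assert(path != NULL && msg != NULL);
    size_t len = strlen(msg);
    int fd = open(path, O_WRONLY);   /* blocks until a reader has it open */
    if (fd == -1) return -1;
    ssize_t n = write(fd, msg, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```

The client half is the part a shell script can do with a plain redirect to the pipe's pathname. Note that the reader sees one undifferentiated byte stream: nothing in it says which client wrote what.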

Linux Abstract Socket Namespace

Linux has a special feature: if the pathname for a UNIX domain socket begins with a null byte, its name is not mapped into the filesystem. Thus it won’t collide with other names in the filesystem. Also, when a server closes its UNIX domain listening socket in the abstract namespace, the name disappears automatically since there is no file to clean up; with regular UNIX domain sockets, the file persists after the server closes it.
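
A minimal sketch of binding in the abstract namespace follows. The function name `abstract_listen` is made up for illustration, and this is Linux-only. The detail that trips people up is the address length: it has to cover the leading null byte plus the name, not the whole structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind a listening stream socket to an abstract name (Linux only).
 * The name is stored after a leading null byte in sun_path, and the
 * length passed to bind() covers exactly that null byte plus the name.
 * Returns the socket, or -1 on error. */
int abstract_listen(const char *name) {
    struct sockaddr_un addr;
    size_t n;
    assert(name != NULL);
    n = strlen(name);
    if (n == 0 || n + 1 > sizeof(addr.sun_path)) return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + 1, name, n);   /* sun_path[0] stays '\0' */
    socklen_t len =
        (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) return -1;
    if (bind(fd, (struct sockaddr *)&addr, len) == -1 ||
        listen(fd, 5) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

While the socket is open, a second bind to the same name fails; once it is closed, the name is free again with no unlink required. A client connects the same way, building the identical address and length.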


Resources

Here are some C programs that implement a UNIX domain socket client and server. These are placed in the public domain.



uthash 1.9.4 released
June 5, 2011, 12:24 pm
Filed under: Software updates

uthash v1.9.4 has been released.


Version 1.9.4 (2011-06-05)

  • uthash now supports MurmurHash v3
  • utlist now includes concatenation macros (LL_CONCAT and DL_CONCAT)
  • utarray now supports binary search (utarray_find)
  • utstring now supports a new-or-clear-existing macro (utstring_renew)
  • documented technique for a multi-level hash table
  • clarified scope requirements for UT_icd in the utarray documentation
  • fixed termination when utstring_clear is followed by utstring_body
  • fixed utarray_inserta macro when used with complex arguments
  • on Visual Studio define missing type uint8_t
  • Debian/Ubuntu include uthash in the package uthash-dev.
  • uthash has been downloaded 16,211 times.

Thanks to Yu Feng, Richard Cook, Dino Ciuffetti, Chris Groer, and Arun Cherian for feedback and fixes in this release!

Edited to add: nice example of an LRU cache with uthash at http://jehiah.cz/a/uthash



upstart: init in Ubuntu
February 28, 2011, 6:04 pm
Filed under: Software updates

Here’s an example of software just working well: I went to write a “startup script” for my process (you know, one of those init.d startup scripts that gets your process going when the box is rebooted). I learned that, on Ubuntu, the init process has been replaced with the very friendly “upstart” package.

Here’s how a startup script looks. This is for the acpid daemon:

# acpid - ACPI daemon
#
# The ACPI daemon provides a socket for other daemons to multiplex kernel
# ACPI events from, and a framework for reacting to those events.
description   "ACPI daemon"
start on runlevel [2345]
stop on runlevel [!2345]
expect fork
respawn
exec acpid -c /etc/acpi/events -s /var/run/acpid.socket

I copied this file (renaming it for my own process and adjusting the exec line), put the file in /etc/init/pm.conf and voilà– my work was done. I just had to start up the process manually the first time using,

service pm start

A few pointers:

  • If you want to see the upstart logs, look in /var/log/daemon.log.
  • upstart itself is documented in init(8) (man -s 8 init)
  • the conf file syntax is documented in init(5) (man -s 5 init)


A sampling of C snippets
December 19, 2010, 5:38 pm
Filed under: snippets

On my Scripts and Snippets page, there are a dozen or so tiny C programs that each show usage of a particular library (like PCRE), technique (like a DNS lookup), tricky task (like writing a parser) or easy “boilerplate” code (like read some options and iterate over lines of a file).

These programs are there to jog my memory, or to give me something to quickly copy and modify when doing something similar in the future. Here are some of the snippets that are there today. Maybe they’ll be useful to you if you need to see a quick, minimal complete program that does one of these tasks. They’re all placed in the public domain.


C snippets

  • lemon using the Lemon parser generator
  • dns doing DNS lookup
  • udp UDP client and server
  • tcp TCP client and server
  • ifinfo list network interfaces, their IPv4/v6 addresses and MACs
  • pcre using pcre library
  • dlsym using dlsym
  • readline using readline library
  • looper iterating over lines of file
  • slurp using read or mmap to slurp a whole file
  • mworkers a sigsuspend-based parent process managing multiple worker processes
  • bloom building a Bloom filter from lines of one file, and testing other lines on it
  • zlib a simple gzip/gunzip implementation using the zlib C API
  • tconf function to read configuration file entries
  • iconv using the iconv library


uthash 1.9.3 released
October 31, 2010, 4:08 pm
Filed under: Software updates

uthash v1.9.3 has been released. This is a minor bug fix release.


Version 1.9.3 (2010-10-31)

  • fix an ifdef for compatibility with Intel compiler (thanks, degski!)
  • fix HASH_ITER macro to satisfy C++ casting rules (thanks, Erik Bai!)


uthash 1.9.2 released
October 4, 2010, 3:20 am
Filed under: technotes

uthash v1.9.2 has been released. This is a minor enhancement release with a new iteration macro, a bug fix and a port of the hashscan utility to FreeBSD. Here is the ChangeLog entry:


Version 1.9.2 (2010-10-04)

  • new HASH_ITER macro for more convenient deletion-safe iteration
  • hashscan can now run on FreeBSD 8.1 and later (thanks, Markus Gebert!)
  • More parens to evaluate complex macro arguments properly (thanks, ngg!)
  • Add sz parameter to the uthash_free hook for platforms that do their own memory management. Hopefully this minor API change doesn’t cause too much breakage for people. (thanks, Niall Douglas!)
  • uthash has been downloaded 12,294 times

Upcoming tpl release

I hope to make a release of tpl before the year is out. So, this would be a good time to submit any feature requests. Feel free to post them here or email me directly at tdh@tkhanson.net.



Virtual Memory in a nutshell
July 18, 2010, 11:15 am
Filed under: technotes

In 2009 I put together a set of slides about Virtual Memory (pdf). Recently someone suggested I post them online, so here they are. This might be of interest to anyone doing Linux/Unix programming who’s struggled with questions like,

  • How much memory does my process really take?
  • What do RSS and VSIZE mean?
  • If two processes share memory, does their RSS/VSIZE give a misleading picture of true memory usage?
  • What does it mean for the kernel to “overcommit” memory? Why does it do that?
  • When a process creates another process via fork, does all its memory get copied? What is copy-on-write?
  • Can I watch these phenomena in action using system commands?

Now, I’m not a memory expert. If you need a deep reference on Linux kernel memory management, this is not the place to look. But if you’re a programmer looking for a 10 minute demystification of Linux virtual memory then you might find it helpful.
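
To see the fork question in action, here is a minimal sketch (not taken from the slides; the function name is made up for illustration). The child overwrites a buffer it inherited, and the parent checks that its own copy is untouched. On Linux the pages are shared after fork and only copied when one side writes to them; this program shows the visible result of that, not the page-level mechanism itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Allocate a buffer, fork, and let the child overwrite it.
 * Returns the first byte the parent sees afterwards, or -1 on error. */
int parent_byte_after_child_write(size_t len) {
    assert(len > 0);
    unsigned char *buf = malloc(len);
    if (buf == NULL) return -1;
    memset(buf, 'P', len);          /* touch the pages so they are resident */

    pid_t pid = fork();             /* child starts out sharing these pages */
    if (pid == -1) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        memset(buf, 'C', len);      /* writes land in the child's own copies */
        _exit(buf[0] == 'C' ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        free(buf);
        return -1;
    }
    int seen = buf[0];              /* the parent's data is unchanged */
    free(buf);
    return seen;
}
```

Calling it with a large length and watching both processes in a tool like top or smem while the child is writing is one way to watch the sharing go away as pages get copied.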

smem

Also, see this post on Blog O’ Matty about the smem utility. For newer kernels (> 2.6.27) this tool can report a new Proportional Set Size (PSS). This is an interesting way to deal with the shortcoming of RSS: when two processes share resident memory pages, RSS makes it look like they’re both consuming that much memory, but PSS will show each process consuming part of the total. To put that in perspective, start several bash shells and run smem. Here, you can see RSS makes it “look” like each shell uses over 3 MB of RAM, but PSS reveals that due to sharing each shell really uses less than 2 MB.

$ smem -P bash
 PID User    Command        USS      PSS      RSS
2521 hanson  bash          1928     2029     3488
2538 hanson  bash          1928     2029     3488
2555 hanson  bash          1928     2029     3488
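
If you want the same numbers from inside a program, the kernel exposes them per mapping in /proc/self/smaps. Here is a minimal sketch that sums one field across all mappings; the function name is made up for illustration, and it assumes a Linux kernel new enough to report Pss.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sum one field (for example "Rss" or "Pss") over every mapping in
 * /proc/self/smaps. Returns the total in kB, or -1 if the file cannot
 * be read. */
long smaps_total_kb(const char *field) {
    char line[256], key[64];
    long total = 0, kb;
    assert(field != NULL);
    FILE *f = fopen("/proc/self/smaps", "r");
    if (f == NULL) return -1;
    while (fgets(line, sizeof(line), f) != NULL) {
        /* field lines look like "Pss:                  12 kB" */
        if (sscanf(line, "%63[^:]: %ld kB", key, &kb) == 2 &&
            strcmp(key, field) == 0)
            total += kb;
    }
    fclose(f);
    return total;
}
```

Comparing `smaps_total_kb("Pss")` with `smaps_total_kb("Rss")` for a process that uses shared libraries shows the same gap as the smem output above: PSS never exceeds RSS, and is lower whenever pages are shared with other processes.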


