Roger B. Dannenberg


Building Music Systems with O2 and O2lite


Summary

The O2 protocol extends the real-time messaging features of Open Sound Control with new capabilities including named services, discovery, clock synchronization and timed messages, reliable message transmission, and publish/subscribe capabilities. Recent work adds a lightweight protocol that brings O2 capabilities to devices that lack a full implementation of TCP/IP. The new protocol, O2lite, enables connectivity with small microcontrollers over Wi-Fi, web browsers over WebSockets, and even with threads that communicate through shared memory.

This article describes an application of O2 and O2lite. It is written to accompany a demo session at NIME 2022.

About the Author

Roger B. Dannenberg is Emeritus Professor of Computer Science at Carnegie Mellon University. He is known for a broad range of research in Computer Music, including the creation of interactive computer accompaniment systems, languages for computer music, music understanding systems, and music composing software. He is a co-creator of Audacity, perhaps the most widely used music editing software.

Additional References

O2 source code is open and free.

A video prepared for ICMC 2022 demonstrates location independence and discovery features of O2.

Articles on O2 are listed in my bibliography.

Researchers and creators, whose first priority may be to design human-computer interfaces, must often create computer-computer interfaces to:

These are just some of the many practical networking scenarios that occur in our practice.

There are lots of solutions. ("The great thing about standards is there are so many to choose from!" - Andrew Tanenbaum.) O2 adopts many ideas from Open Sound Control (OSC) and was created to solve many of the problems that remain. While OSC is only a specification for network messages, O2 is a protocol and API that includes the automatic formation of peer-to-peer networks, clock synchronization and much more (see the Summary and Additional References near the top of this page).

Video Demonstration

Without further ado, here's a video of the system in action:

In summary, we have created a system from 4 components, each using a different technology (microcontroller, browser, Serpent application, Soundcool) and shown how O2 provides the “glue” to put it all together.

Why O2lite?

The added complexity of O2 comes at a cost. It is not simple to provide the full O2 functionality on the smallest microcontrollers, such as Arduino and ESP32 systems, or on systems without full TCP/IP stacks such as web browsers. Thus, my goal of simple “universal” peer-to-peer discovery and real-time messaging falls short in some important use cases.

O2lite is basically a subset of O2. An O2lite process connects to an O2 host (protocols include WebSockets, TCP/IP and high-performance lock-free queues in shared memory) and uses it to route messages to and from an entire O2 peer-to-peer network. A minimal implementation is much smaller than O2, but most O2 services are still available, even if they are one hop away in the host process. Usually, the host would be an O2 application running on a laptop.

Great Features of O2

Here is a list of capabilities that are powerful and somewhat distinctive in O2:

Discovery

Discovery means that O2 processes find and connect themselves without special configuration by the user. There is no typing of IP addresses or configuration to specify URLs, local vs. remote network, etc. O2 now uses Bonjour (aka Avahi on Linux), which has a very solid and scalable design and even runs on microcomputers.

Service-based Addressing

O2 networks are organized around named services. To send the /play/v1 command to a synthesizer, we give the synthesizer a service name, e.g., synth, and use this address for our command: /synth/play/v1. The first node in an O2 address always names a service, and O2 uses the service name to locate the service provider, which can be anywhere. You do not have to change code to reconfigure your system.
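
The idea behind service-based addressing can be sketched in a few lines of JavaScript. This is only an illustration and not how O2 is implemented: splitAddress, route, and the providers table (with the made-up location "laptop-A") are hypothetical names introduced here. The sketch takes the first node of an address as the service name and looks up where that service currently lives.

```javascript
// Hypothetical helper, not part of the O2 API: split an O2-style address
// into its service name (first node) and the rest of the path.
function splitAddress(address) {
    if (typeof address !== "string" || address[0] !== "/") {
        throw new Error("address must start with '/'");
    }
    const slash = address.indexOf("/", 1);
    if (slash === -1) {
        return { service: address.slice(1), path: "" };
    }
    return { service: address.slice(1, slash), path: address.slice(slash) };
}

// Toy table from service name to provider location (assumed values).
const providers = new Map([["synth", "laptop-A"]]);

// Look up the provider for an address; returns null for unknown services.
function route(address) {
    const { service, path } = splitAddress(address);
    const provider = providers.get(service);
    return provider === undefined ? null : { provider, path };
}
```

If the synthesizer moves to another machine, only the table entry changes; the sender keeps using /synth/play/v1. In O2, discovery plays the role of keeping such a table up to date.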

Property Lists

Sometimes, you may not be sure what service you want or what features it has. Maybe services represent laptop performers and you want to know the stage location of player "xyzzy." xyzzy can execute
o2_service_set_property("xyzzy", "xloc", "0.3");
o2_service_set_property("xyzzy", "yloc", "0.8");

and other O2 applications can retrieve those properties without sending request messages, waiting for replies, etc.

Wide Area Networks

O2 applications are not limited to local area networks where Bonjour works for discovery. To make global connections, O2 uses a third-party "broker" running the MQTT IoT protocol for discovery. If necessary to get through firewalls and NAT (where you do not have a public IP address), O2 messages can also be forwarded through MQTT.

An O2 Demonstration

Before diving into details, here's our objective: I have an algorithm that makes sound by triggering percussion sounds while varying playback speed and feedback delay parameters. I want to send OSC messages to control Soundcool, which does all the audio. I want to control the algorithm interactively with an accelerometer/gyro IMU running on a microcontroller. Finally, I want a graphical interface in a browser written in p5.js. How do I put these together?

This figure illustrates the system, constructed with O2 as “glue” to connect the various processes:

We will go through each component: The sensors, the control program, Soundcool, and finally the browser interface.

O2lite for ESP32/Arduino

Let’s move on to some examples. I created this sensor using an ESP32 Thing and Motion Shield from SparkFun. Running O2lite, the ESP32 discovers an O2 process running on my laptop, connects, and sends accelerometer data to service imu (Inertial Measurement Unit).

Code for the ESP32

To give a sense of what things look like on the ESP32, here are some highlights with commentary. First, we initialize the network and O2 connection. Unfortunately, we have to compile in the network name and password as with any Wi-Fi device. Once on the network, discovery kicks in. O2 processes name an ensemble and only processes in the same ensemble will connect and communicate. We are going to join the "scimu" ensemble:
#include "o2lite.h" // include the O2lite API

void setup()
{
    connect_to_wifi(HOSTNAME, NETWORK_NAME, NETWORK_PSWD);  // connect to Wi-Fi
    o2l_initialize("scimu");  // connect to the "scimu" ensemble
    ...
}
In the loop, we just need to call o2l_poll(), which runs O2lite, checking for incoming messages and also running clock synchronization. This application also polls for IMU sensor data:
void loop()
{
    ...  // the main polling loop runs forever
    o2l_poll();  // O2lite processing
    imu_loop();  // receive data from IMU
}

We send IMU data every 50ms by calling this function from imu_loop(). An O2 message is assembled by calling o2l_send_start() with the O2 address (the remote service receiving the message is "imu," which is automatically discovered), the timestamp (0 means "now"), the type string (send 7 floats), and the reliable flag: false means send over UDP. Since we'll send an update soon, there's no need to retransmit dropped Wi-Fi messages.

Next, we add the parameters by calling o2l_add_float() 7 times. Finally we send the message with o2l_send():

void send_sensor_readings()
{
    o2l_send_start("/imu/imu", 0, "fffffff", false);  // begin message
    o2l_add_float(runTime);
    o2l_add_float(imu.calcAccel(imu.ax));
    o2l_add_float(imu.calcAccel(imu.ay));
    o2l_add_float(imu.calcAccel(imu.az));
    o2l_add_float(imu.calcGyro(imu.gx));
    o2l_add_float(imu.calcGyro(imu.gy));
    o2l_add_float(imu.calcGyro(imu.gz));
    o2l_send();
}
Note that there's no hard-coded IP address, so you can run the "imu" service anywhere on the network and O2lite will find it. This is especially convenient with laptops and Wi-Fi, where you rarely have a fixed IP address, and there is no way to give a new IP address to the ESP32 unless you want to recompile and reload the ESP32 software.

The Control Program

The control program is written in Serpent, which is my own Python-like language for real-time music systems. The language details are not important, but here's how it works with O2 (it would be similar in C or C++).

Initialization

Since this will all run on a private Wi-Fi network, the initialization routine disables O2 Internet connections, which allows O2 to start without waiting for a public IP address. Next, we tell O2 that we are providing the time reference for the ensemble. O2 is initialized and some services and message handlers are installed. To connect to Soundcool, we need to become an OSC client. We call o2_osc_delegate() to tell O2 that the "soundcool" service is handled by an OSC server at address localhost:8000. Finally, we enable the built-in web server to allow WebSocket connections.

def startup()
    o2_internet_enable(false)  // local area network only
    o2_clock_set()  // we are the time reference
    o2_initialize("scimu", false)  // start O2

    o2_service_new("imu")  // create service
    o2_service_new("ctrl") // create service
    o2_method_new("/imu/imu", "fffffff", 'imu_handler', true)
    ... more handlers are installed here...
    o2_osc_delegate("soundcool", "localhost", 8000, false)
    o2_http_initialize(8080, "www")

Receiving Messages

The control program gets messages from the sensor and from the browser-based user interface. A simplified handler for IMU data is shown below. imu_handler is called when a message is received for /imu/imu, as specified by the o2_method_new call (see above).

The function send_afloat shows how we send OSC messages to Soundcool. The OSC address is given by address. We simply prepend the service name (/soundcool), and O2 does the rest according to the previous o2_osc_delegate call (see above).

def imu_handler(timestamp, path, types, rest parameters):
    var roll_ga = parameters[1]
    var pitch_ga = parameters[2]

    var roll_gam = apply_map("roll", roll_ga)
    var pitch_gam = apply_map("pitch", pitch_ga)
  
    if roll_gam > 1:
        ptrig("1")  // trigger sound 1
    ... more conditions/responses...

def send_afloat(address, x)
    o2_send_start()
    o2_add_float(x)
    o2_send_finish(0.0, "/soundcool" + address, UDP)

Handlers for messages from the graphical user interface are similar. Even though those messages arrive via WebSocket, everything is based on O2 messages.

Web-Based Graphical Interface

We've already seen how to receive messages from the browser (using o2_http_initialize, described earlier). On the browser side, we use JavaScript to send O2lite messages. The key points are shown below.

Initialization

O2lite functions in JavaScript are prefixed with o2ws_ to emphasize that this is the WebSockets interface. We start O2lite processing with o2ws_initialize("scimu"). The _o2 service is built-in and exists in every O2 and O2lite process. Normally, this is used for internal O2 management, but a special case is that in O2lite, we can receive status updates by handling messages to /_o2/st as shown here.

The status_handler function tests for status changes on the ctrl service, which is offered by the control program (see o2_service_new("ctrl") in the Control Program initialization earlier).

function app_init() {
    o2ws_status_msgs_enable = true;
    o2ws_status_msg("Initializing O2ws with " + WSURI);
    o2ws_initialize("scimu");
    // create handler for server status messages
    o2ws_method_new("/_o2/st", "si", true, status_handler, null);
    server_found = false;
    poll_for_server();
}

function status_handler(timestamp, address, typespec, info) {
    var service_name = o2ws_get_string();
    var service_status = o2ws_get_int32();
    if (service_name == "ctrl" && service_status >= O2_LOCAL_NOTIME) {
        server_found = true;
    }
}

With O2lite, to receive service status messages, we need to ask for them by sending a message. This is the function of poll_for_server, called above in app_init. In the following simplified implementation, we ask for status every 250ms until the ctrl service is ready.

function poll_for_server() {
    if (!server_found) {
        o2ws_send_cmd("/_o2/ws/st", 0, "s", "ctrl");
        setTimeout(poll_for_server, 250); 
    }
}
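
The same polling pattern can be written as a small generic helper. The following is a sketch, not part of the O2lite API: pollUntil and all of its parameters are hypothetical names. It assumes we may want to give up after a maximum number of attempts, which the simplified version above does not do, and it takes the scheduling function as a parameter (defaulting to setTimeout) only so that the logic can be exercised without real timers.

```javascript
// Hypothetical helper: call ask() every intervalMs milliseconds until
// isReady() returns true or maxAttempts requests have been made.
function pollUntil(isReady, ask, intervalMs, maxAttempts, schedule = setTimeout) {
    let attempts = 0;
    function tick() {
        if (isReady() || attempts >= maxAttempts) {
            return;  // either the service is ready or we give up
        }
        attempts += 1;
        ask();                       // e.g. send the status request
        schedule(tick, intervalMs);  // check again later
    }
    tick();
}
```

In the application above, isReady would test server_found and ask would send the /_o2/ws/st request for "ctrl".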

Sending Mapping Parameters

Most of the graphical interface code is an implementation of the bars shown in the figure above. The user can drag the ends of bars left and right. The top bar represents the input range. The bottom bar represents the output range. By changing the bars, we adjust a linear mapping from input to output. For example, the middle bars ("rollrms") in the figure say that the input range (0 to 1) is mapped to the output range 0 to 0.48.

The entire mapping is represented by the formula y = mx + b, so all we need to send are the name of the map, m and b. The following method of class Mapping is called in our p5.js implementation:

    send_update() {
        o2ws_send_start("/ctrl/map", 0, "sff", false);
        o2ws_add_string(this.name);
        ... calculation of m and b are omitted ...
        o2ws_add_float(m);
        o2ws_add_float(b);
        o2ws_send_finish();
    }
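
The calculation of m and b is omitted above. One plausible way to derive them, assuming each bar is described by its two end values, is sketched below. The names linearMap and applyMap and the endpoint parameters are assumptions made for illustration; this is not the actual code of the Mapping class.

```javascript
// Hypothetical helper: derive m and b for y = m*x + b so that the input
// range [inLo, inHi] maps onto the output range [outLo, outHi].
function linearMap(inLo, inHi, outLo, outHi) {
    if (inHi === inLo) {
        throw new Error("input range must have nonzero width");
    }
    const m = (outHi - outLo) / (inHi - inLo);  // slope
    const b = outLo - m * inLo;                 // intercept
    return { m, b };
}

// Apply a mapping produced by linearMap to an input value x.
function applyMap(map, x) {
    return map.m * x + map.b;
}

// The "rollrms" example: input 0 to 1 mapped to output 0 to 0.48.
const rollrms = linearMap(0, 1, 0, 0.48);
```

For the "rollrms" example this gives m = 0.48 and b = 0, so only the map name and these two numbers need to travel in the /ctrl/map message.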

Summary and Conclusions

O2lite is a variation of the O2 protocol, which is designed to allow O2-based applications to interoperate with new devices and interfaces. O2lite has been implemented using WebSockets to allow browser-based graphical interfaces including those on tablets and smart phones. O2lite also has a TCP/IP implementation that runs on ESP32 microcontrollers, and it is easily adapted to other microcontrollers.

O2lite is simple to implement, and O2lite support for new transports can be added modularly to the O2 library. Although O2lite does not directly offer the same power and peer-to-peer connectivity as O2, O2lite obtains connectivity indirectly by using an O2 process as its gateway to a peer-to-peer O2 network. Once the network is reached, O2lite messages are routed to their destinations, which can be on the same machine, on the local area network, across the global Internet, in another O2lite process, or even offered by an OSC server.

Lessons Learned

The original goal of O2 was high connectivity, discovery and advanced functions, but it was designed to run only over TCP/IP, and even with that limitation, the implementation is complex. O2lite has the advantage that the implementation is small and simple, making it practical to provide many implementations supporting microcontrollers, WebSockets and shared memory interfaces, as well as many languages such as Python, Java, C#, Go, Swift, Ruby, Rust, etc.

Discovery and named services have eliminated much of the pain of networking. With O2, manually assigning IP addresses and port numbers and carefully (re)starting servers before clients have become things of the past. The greater reliability and robustness of O2 leave more time for using communication and creating music.

Future Development

At present, there is no integration of O2 with Pd or Max/MSP. Given the popularity of these platforms in the computer music community, support here is critical and a high priority for future development. Another interesting direction is to use the WebSockets interface to create portable tools for NIME development. Using service queries and taps, a debugging interface can be constructed that attaches to a network and allows the user to “snoop” on O2 messages. Debugging distributed systems is notoriously difficult, and this kind of exploratory and monitoring tool could be invaluable. A second useful tool would be an interactive designer for interfaces, where a user could place interactive controls (buttons, sliders, dials, etc.) and associate them with O2 address strings, similar to Interface Builder or the TouchOSC editor.

Additional references appear near the top of this page.