netfilter project logo Planet Netfilter

May 07, 2012

Harald Welte: Some follow-up on the Osmocom Berlin meetings

We've now had the first two incarnations of the Osmocom Berlin User Group Meeting. The start was great, and we had probably something around 10 attendees. Some were the usual suspects like the various Osmocom developers living in Berlin. But we also had a number of new people attending each of both of the meetings, which is good.

To my big surprise people are even flying in from other parts of Europe in order to be able to attend. Last time from Sweden, and for the next meeting some folks from the Netherlands have announced themselves.

To an even bigger surprise, the attendee from Sweden announced that he is working for an Ericsson research lab, and apparently they are using OsmocomBB quite a bit inside that lab. They think it's a great tool, and apparently nothing else with the same flexibility (i.e. full source code) is at their hands that can compete.

On the one hand it is surprising to see such a large traditional Telco supplier to start to use such amateur tools like OsmocomBB, which definitely have not had even a fraction of the testing (particularly with various operators in various countries) like the commercial protocol stacks.

On the other hand, if you think more about it, Ericsson is entirely a network equipment supplier today. They have spun off their baseband processor business to become part of ST-Ericsson, they have pulled out of Sony-Ericsson, sold their TEMS product line to Ascom and other bits and pieces to Tieto. So right now, if they need a MS-side protocol stack or engineering phones, they probably have to obtain what is available on the market. And that's unfortunately not all that great, as the products are either

  • Measurement devices aimed at mostly L1 testing / QA (Racal, Agilent, Rohde-Schwarz)
  • Trace mobiles primarily aimed at field testing (TEMS, Sagem OT) and while they provide traces they don't permit you to send arbitrary data or behave spec-incompliant
  • Mobile Phone development platforms (Qualcomm, MTK, Infinenon, ...) which don't necessarily give you the full source code to the stack, and are only available if you actually intend to build a handset

So all in all, the more I think about it, it is actually not too surprising that they ended up with OsmocomBB. It's free (as in free beer) and they get the full source code with it. You need a lot of skills and time to get it running and find your way around how to use it, but I guess if you're working in cellular protocols and embedded systems, it's not that hard.

April 28, 2012

Eric Leblond: Doing miracles with darktable and gimp

I’ve worked on a picture of a Volkswagen Beetle using Darktable and Gimp for post processing. This two tools are free available free software. Darktable is for now only available on Linux but Gimp is available for most platform.

The picture was made during autumn 2011 in San Francisco. It features an old Volkswagen Beetle in a parking near a house. There is an old cover on the car which gave a strange pirat look to the car. The picture straight out of the camera is the following:

A Beetle is a fun car and I did not want to make a sad picture from this photography. I thus decide to make my best to transform the original picture in a shiny colorful seventies spirit image. Most technical parameters of the RAW file are correct: The exposure is almost good if we omit some highlights and sharpness is here too. It is thus a good basis to work on.

The main problem of the image is that the car is dark. To enlighten the dark part of an image, on of the best tool of darktable is “shadows and highlights”.

Just activating it with default settings in enough to improve the situation but here we need to increase the effect. I thus set the shadow value to something like 170% to get a really lightful car. The color become light but not enough flashy. It then used the “color zone” filter to boost the color of the car by increasing the saturation for the main color of the car. I’ve also increase the light in the blue to have a less dark house.

This was enough to have what I wanted in term of color but I also wanted to give on oldies look to the picture. For that, I used the soften filter:

I’ve not done much personal setting here, just changing the size parameter.

The result was matching my expectation:

Hummm, almost matching. The sun light is making a too bright zone in front of the car. I thus decide to use gimp to locally decrease the lightness in that zone.

I opened the JPEG export in Gimp. I used the “Fuzzy selector tool” (activating the “Feather edges” option) to select the highlight zone. I copied this selection into a new layer and simply decrease the lightness of the extracted zone.

The result was an old but shiny Beetle. You can see it on 500px:

Back to the seventies. by Eric Leblond (regit) on 500px.com
Back to the seventies. by Eric Leblond

April 27, 2012

Eric Leblond: Building Suricata for OpenBSD 4.9 and over

It seems OpenBSD upgrade are done to give maintenance work to the developers of third-party application. In a way, OpenBSD fight against the economic crisis: It gives jobs to developers and if you want some performance you need a powerful thus new computer.

Let’s stop bashing and be serious: Suricata was building fine on OpenBSD 4.8 but the build was failing on subsequent version. This was link with an include modification around the “socket.h” file. It is now mandatory to include “types.h” before “socket.h” to avoid compilation error. The patch 0001-Fix-OpenBSD-compilation.patch.gz fixes the build.

I’ve also updated my Building Suricata under OpenBSD post which does explain installation of dependencies which are now mandatory since Suricata 1.2.

April 09, 2012

Harald Welte: Prototype smart card chips in DIL-40 case have arrived

Finally, the first samples of the smart card chip (for the Osmocom CardOS project) have arrived. As opposed to the final smart cards, this one has been packaged in a DIL case instead of the usual thin credit-card sized plastic. The reason for this is quite simple: This way lots of I/O pins for debugging as well as JTAG can be accessible during COS development.

Here you can see the first incarnation of a veroboard connected to an adapter pcb inside an Omnikey smart card reader:

After confirming it worked, I soldered the wires directly to the adapter PCB, as can be seen here:

There is already a real PCB design that is currently manufactured, i.e. in a week or so there will be a picture of a clean, professionally-produced/etched PCB with all of the prototype pins exported.

In terms of the COS, I haven't done much more work than compared to the last posting, mainly due to a large number of other projects. But we will get there...

Harald Welte: Name that UART: April 2012

It's sort of a cheap knock-off idea stolen from the Name that Ware on bunnies blog: I'm going to post one picture every month about a UART that I found on embedded hardware. Unfortunately I don't have much to offer in terms of a reward for whoever finds the true solution ;)

In any case, every month there are devices that I'm looking into either out of my own interest, or because the work at gpl-violations.org requires it. In most of them, you can find a UART to get to the u-boot / Linux serial console.

So here is the device that I just took apart earlier today:

The location of the UART pads was obvious, after looking at the PCB for a very short time. The entire unpopulated U1 footprint appeared suspiciously like a UART level shifter for true RS232 voltage levels:

  • You can see two signals going directly to a small unpopualted3-pin header
  • There are two other signals coming from somewhere under the main SoC
  • There are capacitors (C440, C441) directly connected to the U1 for the charge pump

March 29, 2012

Rusty Russell: 1 Week to Go, and Rusty Goes Offline

Just as the Linux kernel merge window closes, I’m going offline.  My wedding is exactly a week away, but I’ll be entertaining guests and doing final preparation.  I’ll be back from our honeymoon and wading through mail on the 7 May.

Alex’s “A Bald Target” campaign to raise awareness for TimeForKids has been a huge success, even though we’re currently far short of the hair-shaving goal.  She’s been on one of the local radio stations, with newspaper coverage expected this weekend; two local TV stations want to cover the actual shave if it happens.  The charity is delighted with the amount of publicity they have received; given that they need local people to volunteer to mentor the disadvantaged children, that’s worth at least as much as the money.

Special thanks to a couple of people who donated direct to the charity, to avoid causing baldness!  And yes, if we were starting again, having competing “shave” vs “save” campaigns would have been awesome…

Rusty Russell: Sources of Randomness for Userspace

I’ve been thinking about a new CCAN module for getting a random seed.  Clearly, /dev/urandom is your friend here: on Ubuntu and other distributions it’s saved and restored across reboots, but traditionally server systems have lacked sources of entropy, so it’s worth thinking about other sources of randomness.  Assume for a moment that we mix them well, so any non-randomness is irrelevant.

There are three obvious classes of randomness: things about the particular machine we’re on, things about the particular boot of the machine we’re on, and things which will vary every time we ask.

The Machine We’re On

Of course, much of this is guessable if someone has physical access to the box or knows something about the vendor or the owner, but it might be worth seeding this into /dev/urandom at install time.

On Linux, we can look in /proc/cpuinfo for some sources of machine info: for the 13 x86 machines my friends on IRC had in easy reach, we get three distinct values for cpu cores, three for siblings, two for cpu family, eight for model, six for cache size, and twelve for cpu MHz.  These values are obviously somewhat correlated, but it’s a fair guess that we can get 8 bits here.

Ethernet addresses are unique, so I think it’s fair to say there’s at least another 8 bits of entropy there, though often devices have consecutive numbers if they’re from the same vendor, so this doesn’t just multiply by number of NICs.

The amount of RAM in the machine is worth another two bits, and the other kinds of devices eg. trolling /sys/devices, which can be expected to give another few bits, even in machines which have fairly standard hardware settings like laptops.  Alternately, we could get this information indirectly by looking at /proc/modules.

Installed software gives a maximum three bits, since we can assume a recent version of a mainstream distribution.  Package listings can also be fairly standard, but most people install some extra things so we might assume a few more bits here.  Ubuntu systems ask for your name to base the system name on, so there might be a few bits there (though my laptop is predictably “rusty-x201″).

So, let’s have a guess at 8 + 7 + 2 + 3 + 3 + 2 + 2, ie. 27 bits from the machine configuration itself.

Information About This Boot

I created an upstart script to reboot (and had to hack grub.conf so it wouldn’t set the timeout to -1 for next boot), and let it loop for a day: just under 2000 times in all. I eyeballed the graphs of each stat I gathered against each other, and there didn’t seem to be any surprising correlations.   /proc/uptime gives a fairly uniform range of uptime values within a range of 1 second, at least 6 bits there (every few dozen boots we get an fsck, which gives a different range of values, but the same amount of noise).  /proc/loadavg is pretty constant, unfortunately.  bogomips on CPU1 was fairly constant, but for the boot CPU it looks like a standard distribution within 1 bogomip, in increments of 0.01: say another 7 bits there.

So for each boot we can extract 13 bits from uptime and /proc/cpuinfo.

Things Which Change Every Time We Run

The pid of our process will change every time we’re run, even when started at boot.  My pid was fairly evenly divided on every value between 1220 and 1260, so there’s five bits there.  Unfortunately on both 64 and 32-bit Ubuntu, pids are restricted to 32768 by default.

We can get several more bits from simply timing the other randomness operations.  Modern machines have so much going on that you can probably count on four or five bits of unpredictability over the time you gather these stats.

So another 9 bits every time our process runs, even if it’s run from a boot script or cron.

Conclusion

We can get about 50 bits of randomness without really trying too hard, which is fine for a random server on the internet facing a remote attacker without any inside knowledge, but only about five of these bits (from the process’ own timing) would be unknown to an attacker who has access to the box itself.  So /dev/urandom is still very useful.

On a related note, Paul McKenney pointed me to a paper (abstract, presentation, paper) indicating that even disabling interrupts and running a few instructions gives an unpredictable value in the TSC, and inserting a usleep can make quite a good random number generator.  So if you have access to a high-speed, high-precision timing method, this may itself be sufficient.

March 26, 2012

Harald Welte: h-online article covering OpenBTS and OpenBSC

You can find a 3-page article about OpenBTS, OpenBSC and related projects available from the h-online web site.

Harald Welte: OsmoDevCon 2012 is over...

We just finished the 4th and final day of the OsmoDevCon 2012. It contained four days of in-depth presentations and discussions related to Free Software communications systems, most notably OsmocomBB, OpenBSC, OpenBTS, OsmoNITB, SIMtrace, OsmoGMR, OsmoSDR, rtl-sdr and many more.

I think it was a great chance to make sure the key developers involved with those projects are up-to-date with what everyone else is hacking on. I was especially happy with the presentations of Holger's smalltalk implementation of certain GSM protocols/interfaces, and it seems my small informal Erlang intro has raised some interest.

If anything, the 4-day conference has shown that there is a massive amount of work going on in the various different projects, and that it has clearly grown beyond anything that a single person could still be involved in all the sub-projects.

Personally, I'm happy to see what has grown out of this "we have a BS-11, let's see what we can do with it" that Dieter and I started in 2008. Now we're no longer talking about BTS/A-bis/BSC, but about SS7, MSC, TCAP/MAP, SCCP, HLR, Erlang, smalltalk, DECT, SIM/USIM, COS, SDR, GMR/Thuraya, TETRA and more recently also femtocells as well as NodeBs.

In the spirit of that 2008 presentation Running your own GSM network using the BS-11, Dieter Spaar has now demonstrated his talk on Running your own UMTS network, using NSN or Ericsson NodeBs. I'm really excited to see where that will take us - despite the fact that due to the 5 MHz wide channels, it's pretty close to impossible to get the experimental spectrum licenses that most of us have been able to get in recent years for our work.

As an outlook, over the remaining year 2012, I see progress in the following areas:

  • osmo-nitb will get a VLR/HLR split (async database access)
  • we will build a stand-alone osmo-msc with A interface
  • the signerl TCAP/MAP implementations will be used in production
  • OsmoSDR firmware will be completed, the hardware will start shipping
  • a new card operating system (OsmoCOS) will emerge
  • a UMA gateway will be implemented
  • a Free Software GPRS/EDGE PCU and RLC/MAC implementation will appear
  • last but not least, sysmoBTS will start commercial shipment really soon now

I'd like to thank our host c-base for having us block their conference room for 4 days, as well as all attendees who have travelled from all parts of Europe, but even the United States and Russia to participate. There definitely will be another OsmoDevCon, though we don't know yet at which point in time.

March 18, 2012

Harald Welte: Using a cheap (USD 20) DVB-T USB stick as SDR receiver for (not only) gnuradio

Fellow Osmocom hacker Steve Markgraf has been working on what now seems to be the cheapest way to receive real-world radio signals for PC-based SDR like gnuradio: rtl-sdr. RTL refers to the RTL2832U chipset frequently used in such devices. It can be used to obtain 2.8 Ms/s of 8-bit I+Q samples.

Below is a picture (courtesy of Steve) how the hardware looks like:

March 16, 2012

Harald Welte: Osmocom GPS timing source with u-blox LEA6-T

Recently we have been looking for an inexpensive way to generate a high-accuracy clock source for E1 lines, as it is required by a number of classic BTSs that don't have a sufficiently accurate OCXO built-in.

Luckily, the Digium E1 cards like TE-410P have a timing connector, to which an 8.192 MHz signal can be injected. Unfortunately, there don't seem to be any OCXOs around for that frequency. That's where the u-blox LEA-6T comes into play: It has a configurable TIMEPULSE2 output that can generate any frequency up to 10 MHz. We use this in our board to generate 8.192 Mhz and want to feed that into the Digium card.

So all we had to do is build a small board that contains the module and connector for antenna input, clock output and the obligatory 2.5mm stereo jack for the OsmocomBB-style UART:

Thanks to Sylvain for doing the schematics/PCB design, and thanks to Pablo for writing the code to configurea and activate the 8.192MHz output.

Once the design is verified, the schematics + gerber will be available, as well as board from the sysmocom webshop.

March 14, 2012

Harald Welte: Alcatel MTK phone UART pinout

The Alcatel OT-890D is a MT6573 based smartphone. It seems one of the UARTs is available on test pads as seen in this picture:

The voltage level is still 3.3V, so no fancy 1.8V gear is required.

During boot, the UART is first used at 19200 bps, where it prints the strings "MW01" and "MW02". I then switches to 115200 bps where it prints "READY", and finally switches to 921600 bps, where it seems to output some mixed binary/text messages containing AT commands and responses between AP and BP, as well as some debug information:

�Ue� � � T+CREG=2
�Ue�!�!�!T+CSQSQ=1
�Ue�!�!�!AT+CREG=2
�Uew�"w�"w�"SQSQ=1
'Ue"""      AT+EFUN=1
      SML: Load!_Ue""""""
                         SML: Load!hU("("("

I haven't yet investigated if the binary between the text is some standard HDLC framing or a TS 07.10 multiplex.

If anyone knows more about the boot process (MW01/MW02/READY) or the binary protocol, please let me know.

March 09, 2012

Eric Leblond: Playing with Network Layers to Bypass Firewalls’ Filtering Policy

The slides of my CansecWest talk can now be downloaded: Playing with Network Layers to Bypass Firewalls’ Filtering Policy.

The required counter-measures are described in the Secure use of iptables and connection tracking helpers document

The associated video demonstrations are available:

First video demonstrates how to use forged IRC protocol command (DCC request) to be able to open connection to a NATed client from internet.

Second video demonstrates the effect of the attack on helpers on a non protected Netfilter Firewall.

Third video demonstrates the effect of the attack on helpers on a badly configured Checkpoint firewall.

More information will come in upcoming posts.

Rusty Russell: Oh, BTW, I Am Engaged!

A few of my friends saw the LWN coverage of http://baldalex.org, and sent me a note of congratulations.  This reveals how incredibly slack I am in maintaining connections with my disparate and distributed friends.

Alex and Rusty (2009)

So: I met a wonderful lady, we fell in love, and I proposed on April the 8th last year, at Mt Lofty Gardens overlooking the scenery.  She was speechless, delighted, and said yes!  In related news, one year later, we are set to marry: 5th April 2012, and 3pm at the McLaren Vale Visitor’s centre.  All welcome!  (Yes, that’s Easter Thursday).

So I’ll be offline for April, except briefly to post pictures if we meet the target!

March 02, 2012

Harald Welte: The next project on the horizon: A Free Software CardOS

Now that we have a 100% free software GSM protocol stack and baseband firmware for the network and mobile phone side, the only remaining proprietary part is the SIM card. And what is a SIM card? It's a small embedded computer / SoC with integrated flash + RAM.

Once again, like in many other areas of the telecommunications industry, development of Free Software has been hampered by lack of available register-level hardware documentation. Without such information, how should you be able to program? Hardware without such documentation is an insult to every software developer.

The next problem is that typically, the Card Operating System (COS) is written into mask ROM of the smartcard SoC. Making such a mask is quite expensive, and it means that for every software version, different silicon will have to be produced. So unless you are going to have millions of units in quantity, it is unlikely that it would make economic sense.

However, in recent years, purely flash based smartcard chips have been available and getting less and less expensive. However, none of them (like the Atmel AT90SC7272 or similar devices) have freely available documentation. Furthermore, availability on the open market is somewhat of a problem, mainly because they have been used extensively by people cracking encrypted satellite TV channels. In recent years, the smartcard industry is trying hard to cut any kind of supply to that group of users.

However, luckily, we now see small/independent chip design houses in China picking up and producing their own smartcard chips. They are not only cheaper, but they simply hand out the documentation to anyone who asks them. No questions asked, no NDA required. Welcome to the promised land! That's what Free Software developers like:

  • Free access to documentation without any confidentiality agreements
  • development samples available at the same price as quantity pricing later on
  • inexpensive development hardware with JTAG access
  • reference source code provided without NDA
  • they are happy that somebody wants to develop for their hardware

As you can see, I am quite enthusiastic about this. I like this no-bullshit approach. No stupid marketing and sales droids who charge ridiculous fees for proprietary development tools that are inflexible and force developers to use one particular OS/IDE/toolchain.

I'm not sure how much time there will be, given the multitude of other projects that are all asking for attention. However, I think this is a chance that the Free Software community doesn't get every day. Let's hope some other people like bare iron programming in small embedded systems can get excited and we can create a FOSS COS. It doesn't have to be something serious. Something quite simple would be sufficient for the beginning. I'm not thinking of EAL4+ certification, multiple channels and public key crypto. SIM/USIM cards are simple, they just require a bit of filesystem read/write operations plus authentication. And luckily, SIM toolkit development doesn't have to be done in Java this way, either ;)

Harald Welte: OsmoSDR status update

It has been two months since I first was able to play with the OsmoSDR hardware prototypes. Back at that time, there was no FPGA code yet, and some hardware bugs still had to be resolved. Nonetheless, the e4k tuner driver could already be implemented and tuning was confirmed by looking at the analog i/q spectrum.

Meanwhile, the hardware has been re-worked by SR-Systems and FPGA VHDL code written by maintech.de. Ever since that, they dropped the ball again with me as I had been careless enough to volunteer for writing the firmware.

And that's what I did or at least tried to do for quite some time during the last two weeks. The main problem was that I didn't have much time. The second problem was that I never was able to get the SSC (synchronous serial controller) receive DMA working.

This was a really odd experience, as I've worked a lot with that very same SSC peripheral before, while writing firmware for the OpenPICC some 6 years ago. However, this was in an at91sam7s, where the SSC is interfaced with the PDC (Peripheral DMA Controller). In the at91sam3u of OsmoSDR, it interfaces with a more modern DMAC/HDMA controller, capable of scatter-gather DMA and other fancy stuff.

Atmel has provided reference code that uses the SSC DMA in transmit mode (for a USB audio device playing back music via the Wolfson codec on the SAM3U-EK board). After thoroughly studying the DMAC/HDMA documentation I set out to write code for DMA-based SSC receiver. And it never worked.

I actually wrote two independent implementations, one from scratch and the other based on Atmel reference code. Neither of them worked. It seemed to be a problem with the hardware hand-shaking between SSC and DMAC. The SSC was successfully receiving data, and that data could be read out from the CPU using a polling or IRQ based driver. But if you're running at something like 32 Mbps and don't have a FIFO, you desperately want to use DMA. When the DMA handshaking was turned off, the DMA code worked, but of course it read the same received word several thousand times before the next data arrived on the SSC.

In the end, I was actually convinced it must be a silicon bug. Until I thought well, maybe they just connected the flow controller to a different ID in Rx and Tx direction. Since there are only 16 such identifiers, it was relatively easy to brute-force all of them and see if it worked. And voila - using the identifier 4, it worked!

So what had happened? The Atmel-provided reference code contained a

 #define BOARD_SSC_DMA_HW_SRC_REQ_ID      AT91C_HDMA_SRC_PER_3
and that was wrong. 3 is valid for SSC TX but not for SSC RX. Unfortunately I never found any of those magic numbers in the SAM3U manual either. They are not documented in the chapter of the SSC, and they are not documented in the chapter about HDMA/DMAC either. And they are not identical with the Peripheral Identifiers that are used all over the chip for the built-in peripherals.

In case anyone else is interested, a patch can be found at my at91lib git repository.

I filed a ticket with Atmel support, and they pointed out in fact there was a table with those identifiers somewhere in the early introductory chapters where you can see a brief summary of the features of each integrated peripheral. Unfortunately they use slightly different naming in that chapter and in the DMAC, so a full-text search also didn't find them. Neither is that table visible in the PDF index.

So about four man-days later it was finally working. Another day was spent on integrating it with the USB DMA for sending high-speed isochronous transfers over the bus into the PC. And ever since I'm happily receiving something like 500,000 or 1,000,000 samples / second from an alsa device, using snd-usb-audio. Luckily, unlike MacOS or Windows, the Linux audio drivers don't make arbitrary restrictions in the sample rate. According to the USB Audio spec, the sample rate can be any 24bit number. So audio devices with 16.7 Ms/s are very much within the spec. I hope some of the other OS driver writers would take that to their heart.

One of the first captures can be found at this link, containing a bzip2ed wave file in S16LE format Stereo (I/Q). It contains a FM audio signal transmitted using a small pocket-sized FM transmitter.

There is no I/Q DC offset calibration yet, but once that is done we're probably able to finally put the design into production.

March 01, 2012

Harald Welte: More research into the Motorola Horizon macro and Mo-bis

Once upon a time there was an Americans company called Motorola, and they decided to implement GSM. Unfortunately they decided to deviate significantly from the specification and implement their own proprietary back-haul protocol between BTS and BSC, called Mo-bis. It replaces the standardized A-bis interface.

Today, There are plenty of phased-out Motorola Horizon / Horizon II macro BTSs that have been phased out. Basically you can get them for scrap value, which makes them an ideal target for GSM enthusiasts willing to run a single-cell network with little investment. So while there are actually people who are interested in operating a power-consuming device roughly the size of a washing machine in their home/office - they are normally not interested in running a 19" rack sized Motorola BSC with it. Also, the BSCs are much less frequently to be found compared to the BTS.

So it would be great to support Mo-bis from within OpenBSC. A couple of brave young men have set out to try the seemingly impossible. There's absolutely zero documentation available on that protocol, and no wireshark support either. However, the University of Brno (Czech) has a functional Motorola BTS + BSC setup, and I was able to obtain protocol traces from them and actually experiment with the equipment in their lab.

The entire Motorola GSM architecture seems to be over-engineered without end. Basically you are looking at a distributed computer from the early 1990ies. Lots of processor cards (m68k, ppc) interconnected by HDLC links on top of synchronous 2Mbps links with 64k timeslots. Those links are available e.g. on the backplane of the BTS as a TDM highway. So basically even inside the BTS, the individual processors talk over E1 to each other. In the BSC, there is a token ring based LAN between some of the cards instead. And the MCUF in the BTS even supports to transport those proprietary inter-cpu links via fiber optic (!).

Each processor has a 16bit identifier by which it can be addressed in form of physical addresses. Individual processes on the processors have fixed process identifiers, and they allocate a variety of mailboxes in which they can receive messages from remote processors. There are routing functions at intermediate notes.

So any process on any processor card can send messages to any mailbox of any other process on any other processor, independent of its physical location (locally at the BTS, or at the remote BSC, or even at remote BTSs).

Besides physical addresses, there are also functional addresses. Thos addresses are used particularly to support fail-over. Every board in a BTS and BSC can be fully redundant, and if you use physical addresses, you would address one of the two redundant boards. Using functional addresses, you address the function they both can perform, and some routing magic will make sure it ends up at the current active node in the pair.

There are multiple processors in every TRX, and a couple of processors for each BTS, processors in the E1 line cards, etc. Now speaking of the actual Mo-bis interface: It seems to be a weird mixture between 08.58 (RSL) and 08.08 (BSSAP/BSSMAP). However, after staring at the messages sufficiently long, I have been able to write a more or less complete wireshark dissector for them. Radio Channel Activation (RACH/IMM.ASS) are for example handled directly inside the BTS, they don't exist as transactions on the Mo-bis like they do in A-bis.

So implementing the actual location update / MO+MT voice call and SMS related transactions is actually not all that hard. What makes things really difficult is the way the BTS is initialized at startup. Basically what resembles the OML part of standardized A-bis.

There is a lot of low-level management and bring-up of the individual processes and boards, and the download of a large 500 kByte-sized BLOB simply called database. This binary database contains literally hundreds of configuration parameters for the BTS and its neighbors. It also contains sophisticated configuration of the message routers, the switching/multiplexing of 64k timeslots on the various links, information on redundant paths within the back-haul network, etc.

Interestingly, using the password combination 3beatles and 4stooges on any of the serial consoles of the BTS or BSC, you can enter into a "god-mode" which permits you to enter the executive monitor (EMON). The executive is the operating system they run on both m68k and ppc processors. It provides access to something like a syslog of messages from the various processes, and you can manually generate messages that are to be sent to mailboxes of processes. You can inspect the object table (application programs an databases), read/write to PCMCIA flash cards, read and write to logical and physical memory, inspect CPU and I/O usage and much more. In fact, the integrated Code Object Manager (COM) even allows the processors to synchronize their code versions and remotely boot other CPUS via HDLC channels.

For a communications system geek like myself, it's extremely fascinating to see such a sophisticated and versatile system. I only wonder why on earth somebody would come up with something as complex, only to connect a couple of BTSs to a BSC. Thus, the only logical explanation is that Motorola has developed this distributed proprietary computing system way before they went into GSM, and they probably just recycled it as it already existed.

If anyone knows more about the history of this, I would be excited to hear about it. It literally feels like being an archaeologist. Analyzing ancient technology from our forefathers. But then, it only is 20 odd years old. The only time I had a similar feeling was when I briefly came in touch with IBM mainframes in 2001 and looking at IBMs SNA protocol stack.

February 27, 2012

Rusty Russell: A Plea For Help: Charity

En-haired Fiancée Alex with my daughter Arabella

I’m getting married in just over five weeks!

My fiancée is raising money for charity; if we raise $50,000 by the big day, she will shave her head at the wedding.  Alexandra has had long hair all her life: she’s terrified but determined, so I’m determined to help.

We’re already asking for donations in lieu of wedding presents, but if you’ve ever wanted to buy me a beer for ipchains, iptables, netfilter, module-init-tools, lguest, CCAN, Rusty’s Unreliable Guides, CALU, or any other reason, I’ll take a $100/$20/$5 donation here instead :)

(Compulsory Facebook page here).

February 23, 2012

Eric Leblond: Using AF_PACKET zero copy mode in Suricata

Victor Julien has just pushed a new feature to suricata’s git tree. It brings improvements to the AF_PACKET capture mode.

This capture mode can be used on Linux. It is the native way to capture packet. Suricata is able to use the interesting new multithreading feature provided by AF_PACKET on recent kernels: it is possible to have multiple capture threads receiving the packet of a single interface.

The commits add mmaped ring buffer support to AF_PACKET capture and also provide a zero copy mode. Mmaped ring buffer is mechanism similar to the one used by PF_RING. The kernel allocates some memory to store the packets and share this memory with the capture process. Instead of sending messages, the kernel just write to the shared memory and the process capture reads it. This is less consuming in term of CPU ressource and helps to increase the capture rate. But the main avantage of this technique is that the capture process can treat the packets without making a copy and this saves a lot of time

To activate this features, you need a Suricata compiled from latest git and you need to modify some entries in your suricata.yaml file. You have to tell suricata that you want to activate the mmap feature. For example to activate the feature on eth0, you have to add ‘use-mmap’ to your configuration:

af-packet:
  - interface: eth0
    use-mmap: yes

You can then run Suricata with the command:

suricata -c suricata.yaml --af-packet=eth0

This setup will not activate the zero copy feature which is currently dependant of the running mode. You will need to activate the worker mode to enable zero copy. To do so, run Suricata with a command similar to this one:

suricata -c suricata.yaml --af-packet=eth0 --runmode=worker

This code should provide an interesting performance boost to the AF_PACKET capture system. I’ve no number to provide now but I will be happy to hear some if you make some tests

February 13, 2012

Eric Leblond: Ecosystem of Suricata

Suricata is an IDS/IPS engine. To build a complete solution, you will need to use other tools. The following schema is a representation of a possible software setup in the case Suricata is used as IDS or IPS on the network. It only uses opensource components:

Suricata is used to sniff and analyse the traffic. To detect malicious traffic, it uses signatures (or rules). You can download a set of specialised rules from EmergingThreats.

To analyse the alerts generated by Suricata, you will need an interface such as, for example, snorby.

But this interface will need to have an access to the data. This is done by storing the alerts in a database. The feed of this database can be done by barnyard2 which take the files outputted by suricata in unified2 and convert and send them to a database (or to some other format).

If you want counter-measure to be taken (such as blocking for a moment attacker at the firewall level), you can use snortsam.

More complex setup can include interactions with a SIEM solution.

February 09, 2012

Harald Welte: Some comments on the heated debate on SFC / Busybox / Linux GPL enforcement

During the past week[s], there has been a heated debate on the alleged methods of GPL enforcement as it is performed by the Software Freedom Conservancy on behalf of the Busybox copyright holders.

The extent of license enforcement on Busybox has apparently triggered the proposal to create a non-GPL replacement for it, which in turn has received quite harsh responses e.g. from Matthew Garrett.

It's been relatively difficult for me to figure out what is really going on here. It is well-known that the Free Software Conservancy has been actively enforcing the GPL on Busybox. But then, at the same time gpl-violations.org has been (and still is!) similarly active in enforcing the GPL on the Linux kernel. Still, I haven't yet seen calls to write a non-GPL Linux kernel replacement. Of course, the complexity is on an entirely different scale, so this point is moot.

However, for quite some time there have been rumors about the intensity (some would say aggressiveness) of the enforcement. I don't want to accuse anybody of anything, so I'm going to write speculatively about it.

This post is to summarize my thoughts on all of this:

  • It is well within the right of each author / copyright holder to decide on the enforcement strategy and license interpretation. As such, I respect the decision of the authors. It is their work, they should decide what to do.

  • In any kind of GPL enforcement, you of course not only want the complete corresponding source code to one program, but to all of the GPL/LGPL/AGPL or otherwise copyleft licensed programs contained in the product. We at gpl-violations.org have always been requesting the complete corresponding source code to all GPL licensed software during our communication with the infringing companies. This request was typically honored by everyone, without the need to apply any pressure onto it. After all, releasing only one bit of code causes the risk to get sued by somebody else who owns the other not-yet-compliant part of the code.

    Now there have been rumors that SFC was not only requesting non-Busybox source code, but also making it a condition for the explicit re-instatement of the license on Busybox. Whether or not there was such a hard condition is subject to debate and there are different opinions on it. For those in the field of FOSS licensing, it has always known that there are different lines of thought with regard to the requirement to explicit reinstatement. We in Germany generally think that it is not required at all, and the existing preliminary injunctions at least implicitly acknowledge that as they enjoin companies from distributing a product as long as it is not in compliance with the license. In other (particularly the U.S.), it is generally assumed that explicit reinstatement is required. In such a case, it may very well be legally possible to use it as a lever to obtain source code for other programs like the Linux kernel. However, I am personally not sure if that really is the right strategy. Not everything that is possible legally is ethically the right thing to do. But then, ethics and legal customs differ widely in the FOSS communities, as they do in society in general. Some countries and communities believe in the death penalty, others don't. Some countries allow abortion, others don't. Some allow prostitution, others don't. So when judging about whether that "reinstatement lever" is acceptable or not, we have to accept that there may be different lines of thought. I for my part definitely think that the far superior method is, beyond doubt, to have a rights holder on those other program in order to make any demand for source code (as opposed to a mere request without implicit or explicit legal threat).

  • There also have been rumors about a requirement on submitting future source code releases to a compliance audit by the Conservancy. According to SFC sources, there never was any such demand, and the rumors are likely spawned by some incorrect claims of a defendant in a court case, which ended up in the public record. If there was such a requirement, I wouldn't think it is just - at least not for a first-time non-intentional infringement case. If there was repeated infringement and a clear sign that it would happen again and again, such a requirement for future audits may be justified, depending on the case.

  • People who claim that GPL enforcement is scaring away companies from using Linux and/or other Free Software also have to be careful in what they say. If a commercial entity enters a new market (let's say Android Tablets), then there is a certain due diligence required before entering that market. So if you don't understand Free Software and particularly GPL licensing, then you shouldn't place a Linux-based device on the market. Just think about an analogy: If you have a recycling company and enter a new market (disposal of hazardous chemicals), then you cannot simply treat those chemicals as regular waste, wait until you run into legal trouble and expect to get away with it.

    I think there are still far too many GPL violations out there, and we need to see more enforcement in order to get all the major players in their respective lines of business into compliance. But come on, dealing with embedded devices in 2012 and still getting compliance outright wrong really means that there has not been the least bit of attention on this subject. And without enforcement, it is never going to change. People who want no enforcement should simply use MIT-style licenses.

    Last, but not least, I also think GPL compliance is a matter of fair competition. There are some companies who really do a good job in ensuring compliance with the various Free Software licenses. If their competition doesn't invest the funds into the respective skills, procedures and business processes, they are getting an unfair competitive advantage against those who are doing it right. If there was no enforcement, the motivation would be to reduce efforts in compliance, not increase it.

Let me conclude with a clear statement to anyone who thinks that by replacing Busybox with a non-GPL licensed project they can evade GPL enforcement: It will not work. There are others out there enforcing the GPL. Last but not least gpl-violations.org. Despite the notoriously outdated webpage, we are still alive and kicking, churning down on the violation reports that we receive. Armijn Hemel, Joachim Steiger, Tim Engelhardt, Julia Gebert and Till Jaeger deserve much of the credit for all that work, while I'm mostly spending each awake minute hacking Free Software for mobile communications. Yes, we should publish more about our activities, and I hope to find the time to do so. There should at least be an annual report with the number of cases...

January 28, 2012

Rusty Russell: Why Everyone Must Oppose The Merging of /usr and /

As co-editor of the last edition of the File Hierarchy Standard before it merged into the Linux Standard Base, I’ve been following the discussion about combining the directories  /bin, /sbin and /lib into /usr/bin, /usr/sbin and /usr/lib respectively.  You can follow it too, via the LWN discussion.

To summarize, there are two sides to the debate.  The “pro” side points out:

  1. Nothing will really change for users, as symlinks will make old stuff still work.
  2. There are precedents in Solaris and Fedora.
  3. The weak reasonings used previously to separate / and /usr no longer apply.
  4. Separate /usr has become increasingly unsupported anyway.
  5. Moving to /usr will enable genuine R/O root filesystem sharing.

The “anti” side, however, raises very salient points:

  1. Lennart Poettering supports it.
  2. Lennart Poettering is an asshole.

Fellow Anti-mergers, I understand the pain and anguish that systemd has caused you personally, and your families.  Your hopes and dreams crushed, by someone with all the charm of a cheese grater across the knuckles.  Your remaining life tainted by this putrescent subhuman who forced himself upon your internet.

Despite the privation we have all endured, please find strength to stop this nightmarish ravaging of our once-pure filesystems.  For if he’s not stopped now, what hope for  /usr/sbin vs /usr/bin?

Harald Welte: New OsmocomBB RSSI monitor firmware

Jolly has been hacking up a nice new RSSI monitoring firmware application for OsmocomBB.

I let the pictures speak for themselves:

I really hope this trend continues and we'll get some actual user interface in OsmocomBB at some point this year..

January 25, 2012

Harald Welte: OP25 project joins hosting on osmocom.org

Some days ago, I noticed that the famous OP25 project (a Free Software implementation of the APCO25 system, a digital trunked radio system) was no longer reachable on-line. It seems they were running this on a desktop PC in a university. As nobody in the project still seems to be at that university, a change in the network configuration had accidentally rendered the website unreachable.

After some quick e-mails, I offered to host them within the osmocom.org family of Free Software Projects for mobile communications. This is when op25.osmocom.org was created, and a full-site backup uploaded + installed.

I'm really happy that we were able to do a small part to help to make sure this valuable project remains accessible to interested parties in the signal processing and mobile communications field.

Harald Welte: First osmo-nvs-gps evaluation boards soldered

At the osmocom project, we recently discovered the most interesting NVS NV08C-CSM module. It not only is a superb GPS receiver, but it includes GALILEO and GLONASS receivers, too. However, it's only available as an industry module, or as an expensive (700 EUR or so) evaluation kit.

Given the cheap PCB prototyping service at seeedstudio, I thought I'd spend an afternoon creating the schematics and PCB layout for an evaluation board. It exports the two 3.3V UARTs on OsmocomBB-style 2.5mm jacks, so they can be used with the T191 cables. I have the feeling this 2.5mm jack is becoming a new standard for low-voltage RS232 links ;)

Furthermore, it exports the SPI, I/O and I2C on a 20pin 2.54mm pitch header, connects to an external antenna via a MCX socket and has an optional footprint for a CR2032 battery on the bottom side.

So far, the board seems to be working fine. If there is interest in the bare PCB itself (without components!), please send me an e-mail. Depending on the amount of interest we might add it to the sysmocom webshop.

Schematics and Gerber files will be available at http://openbsc.osmocom.org/trac/wiki/osmo-nvs-gps soon.

January 17, 2012

Harald Welte: Having Fun with DHL Express!

This is what I got when tracking one of my inbound shipments:

It seems DHL is having fun bouncing the package back and forward between Hong Kong and Leipzig(Germany). So far, it started in HK, then arrived in Leipzig on January 8, went back to HK, back to Leipzig, back to HK, back to Leipzig and is currently allegedly again in Hong Kong _after_ succesfully passing German customs clearance on January 15.

For the TCP/IP nerds among the readers: I wonder when the TTL expires.

January 14, 2012

Harald Welte: First assembled prototypes of osmo-e1-xcvr

I mentioned it briefly before: I've designed a small E1/T1/J1 transceiver board, which is going to be used for experimentally interfacing such a TDM line with microcontroller and/or FPGA. The name of this board is osmo-e1-xcvr.

The first prototype PCBs have arrived yesterday, and despite lots of other more important work I couldn't resist but to actually solder some of the units. The result can be seen here:

I don't have time to do anything beyond very basic testing right now, but so far the boards seem to be doing fine. Now we need a driver for the transceiver chip, and connect its control interface over SPI to some microcontroller (likely sam7s/sam3s/sam3u in my case). The actual serial bitstream will end up at the SSC peripheral of the controller.

January 10, 2012

Harald Welte: OsmoSDR status update

It's already two weeks since my last post mentioning OsmoSDR only very briefly. By now, OsmoSDR is fully public and you can read all about it on the http://sdr.osmocom.org/ website.

Specifically, the website includes Schematics, and one of my colorful block diagrams. I am a text guy and really hate to work with graphics, but the block diagram of the Calypso has helped a lot in the context of making people understand OsmocomBB, so I tried again for OsmoSDR:

So as you can see, it is a very simple, straight-forward design with a chip tuner, direct I/Q down-conversion, ADC for differential I and Q signals as well as a small FPGA for basic filtering and serial format conversion, followed by a Atmel SAM3U microcontroller (Cortex-M3, high speed USB).

And yes, it's receive only. However, it has a stacking connector for later addition of a transmit daughter-board which may eventually follow later in 2012.

So what is this OsmoSDR going to be used for? Well, for receiving any kind of signals between 64 MHz and 1700 MHz (possibly up to 1800 MHz, untested). We don't build it for a specific application, but we simply thought that for many applications a member of the USRP family is expensive overkill, and the FCDP has this arbitrary restriction at 96 kHz sampling frequency.

Please note that while I may be the only OsmoSDR developer that blogs about this project, it is very much a team effort and I'm only a minor part of that team. Apart from selecting the SAM3U, writing firmware and drivers for it as well as some discussions early on in the project, I didn't have any involvement in the Hardware design. Those credits go to Christian Daniel and Stefan Reimann.

So where do we stand? There are 5 prototypes of the first generation, they look like this:

There are some smaller hardware issues that were easy to work around, but one bigger problem related to the fact that the Si570 programmable oscillator output levels didn't match quite with what the FPGA clock input requires.

There will be a second generation board fixing this and other problems, and hopefully we'll see some progress in the weeks to come.

Firmware-wise, the code is currently scattered over a couple of different repositories, but I'm going to consolidate that soon. If you've worked with OpenPCD or SIMtrace, you will find some similarities: We split the flash of the controller into two partitions: One for the DFU bootloader, and one for the application code. You can then use the USB DFU (Device Firmware Upgrade) standard to quickly update the actual firmware of the device, without having to resort to jumpers or un-plugging and re-plugging of the hardware.

This also meant that I had to re-implement the old sam7dfu code for the SAM3U, which has considerable differences in the USB peripheral, cpu core, board startup code, etc. So it's really a reimplementation than a port. The good news is that I tried to make it as general as possible and integrate it with at91lib, Atmels reference code. So it should be easier to use it with other Atmel SoCs like the sam3s which I want to use e.g. in SIMtrace v2.

December 24, 2011

Harald Welte: HTCs delays in releasing Linux source code are unacceptable

The Taiwanese smart phone maker HTC is widely known to be delaying its Linux kernel source code releases of their Android products. Initially, this has been described to to the requirement for source code review, and making sure that no proprietary portions are ending up in the release.

While the point is sort-of moot from the beginning (there should be no proprietary portions inside the Linux kernel for a product that wants to avoid entering any legal grey zone in the first place), I was willing to accept/tolerate it for some time.

At one point more than one year ago, gpl-violations.org actually had the opportunity to speak in person to senior HTC staff about this. I made it very clear that this delay is not acceptable, and that they should quickly fix their processes in order to make sure they reduce that delay, eventually down to zero.

Recently, I received news that the opposite is happening. HTC still has the same delays, and they are now actually claiming that even a 120 days delay is in compliance with the license.

I do think neither the paying HTC customers, nor tha Free Software community as a whole have to tolerate those delays. It is true that the GPLv2 doesn't list a deadline until when the source code has to be provided, but it is at the same also very clear what the license wants: To enable people to study the program source code. Especially in todays rapid smart phone product cycles, 120 days is a very long time.

So I hereby declare my patience has ended here. I am determined to bring those outrageous delays to an end. This will be one of my new year resolutions for 2012: Use whatever means possible to make HTC understand that this is not how you can treat Free Software, the community, its customers, the GPL and in the end, copyright itself.

December 23, 2011

Harald Welte: More "bare iron" development: OsmoSDR, osmo-e1-xcvr and SIM bank

I'm currently quite excited to be doing more bare iron programming as well as actual electrical engineering work again.

There's actually not just one project I'm working on, but a variety of them:

  • OsmoSDR - an upcoming small form-factor, inexpensive USB SDR. Not big and expensive like USRP or real "professional" solutions, but also not as weak as the ultra-narrow-band funcube dongle. I wasn't involved in the hardware design, but have volunteered to take care of the firmware development for the Atmel SAM3U micro-controller inside.
  • osmo-e1-xcvr - a relatively simple circuit board containing the magnetics, LIU, clock generation and transceiver for E1/T1/J1 lines. The idea here is to have a solution for the analog part, as well as HDB3/B8ZS/AMI encoding, which can then be attached to any FPGA, CPLD or micro-controller development board. It exposes SPI for controlling the transceiver, and a synchronous serial bit-stream for Rx and Tx.
  • unnamed sim-bank project - here the goal is to find a cheap solution to attach a large number of SIM cards to either a PC or directly to Ethernet. This can be very useful for testing, where you host your sim cards in a centralized location and can borrow them to remote users/devices over TCP/IP. There are commercial devices available for this, but they are quite expensive (like 1700 USD for 32 card device) and intended to be used with some proprietary windows software (who wants that?!?).

In the latter two projects I'm also doing the component selection, schematics design and PCB layout. One project with KiCAD, the other in EAGLE, as I really want to get first-hand experience of the usability of Free vs. proprietary EDA tools. I'd love to also evaluate Altium Designer, but they are still windows-only, and that would just make life way too difficult for me.

The projects will be duly announced soon, and they are all intended to be Open Hardware designs with Free Software. We'll probably also make all of them available at shop.sysmocom.de, too.

December 01, 2011

Eric Leblond: À propos de la publication de code d’EdenWall

J’ai cofondé la société INL en 2004. Renommée en 2009 EdenWall, suite à une levée de fonds et un changement de métier, le nouveau business model de la société fut la commercialisation d’appliances de sécurité basées sur le logiciel libre NuFW que j’avais initié en 2003. NuFW, couche logicielle ajoutant l’authentification des flux à Netfilter, est resté le moteur technologique de la société mais n’était pas d’un accès facile car nécessitant des compétences bas niveaux pour son déploiement. Nous avons donc distribué sous licence libre des briques complémentaires à partir de 2005. Nulog, projet d’analyse de journaux, que j’avais commencé en 2001 et Nuface, interface de configuration de politiques de filtrage en 2005. La conclusion de cette démarche d’ouverture a été NuFirewall, une solution autonome de pare-feu basée sur les briques EdenWall qui a été distribuée en 2010. Il s’agissait d’une version libre des appliances EdenWall distribuée sous forme d’une distribution indépendante publiée sous licence GPL. L’idée des fondateurs était d’avoir une structure de produits similaires à une offre comme celle de VirtualBox avec une distribution sous double licence : une solution libre convenant au plus grand nombre et une version avec des fonctionnalités Entreprise.

Suite à un enchainemenent d’événements malheureux, la société EdenWall a fait faillite en courant d’année 2011. Les différents logiciels libres se sont alors retrouvés orphelins. La FSF en relation avec une partie des anciens développeurs de la société a décidé de coordonner les efforts pour la reprise des différents logiciels libres. Je m’étais battu pour la publication de ses briques et solutions et je me suis donc montré très enthousiaste quant à l’idée de remonter un site hébergeant ce qui avait été publié sous licence libre. C’est ce que j’exprime dans la citation reprise dans le communiqué de presse de la FSF.

Cc’est toutefois la majeure partie du code de la société EdenWall qui a été publié. Cette publication est peut-être valable au point de vue juridique comme expliqué dans le communiqué de la FSF mais elle ne reflète en rien mon positionnement moral personnel. Je ne cautionne donc pas cette publication globale mais uniquement la reprise des projets NuFW et NuFirewall et de manière plus globale du code qui avait déjà été publié sous licence libre par EdenWall.

November 30, 2011

Eric Leblond: Securing Netfilter connection tracking helpers

Following the presentation I’ve made during the 8th Netfilter Workshop, it was decided to write a document containing the best practices for a secure use of iptables and connection tracking helpers. This document called “Secure use of iptables and connection tracking helpers” is now available on this site. It contains recommendations that should be followed carefully if you are the administrator of a Netfilter/Iptables or the developer of a Netfilter based software.

November 28, 2011

Harald Welte: Back home after successful KOSS Legal Conference

The first incarnation of the KOSS Legal Conference was a big success. There were many participants from a variety of backgrounds, such as

  • Independent Korean legal experts
  • Legal scholars from Korean law schools
  • International legal experts (e.g. Till Jaeger, Carlo Piana, etc.)
  • Representatives from the major Korean IT industry
  • Representatives of the community organizations like FSFE
  • Independent technical experts like Armijn Hemel and myself

The discussions have been a big success, with significant participation from the floor. There are many events that I attended where it was hard to actually get any participation from the audience - but the KOSS Law conference was definitely not one of them. Some of the questions were easy to respond to, some other questions really tackled the difficult issues in Free Software License Compliance.

What was clear to see from the Industry participants: FOSS License Compliance has become an important topic in the last couple of years: One the one hand as a result of virtually no TV set / mobile phone / PMP or other device running without Linux or other FOSS. On the other hand, I'm sure that the enforcement efforts of gpl-violations.org and the SFLC also have had significant impact on that.

What I personally find important is that compliance is only considered as part of the overall FOSS picture. Complying with the license text is the minimum that companies involved with FOSS should do. Rather, they should look beyond mere compliance and consider the benefit of engaging more actively with the community, contribute code back upstream/mainline and really becoming a first-class citizen of the Free Software world.

As a big surprise to everyone, Jim Zemlin of the Linux Foundation made a surprise visit towards the end of the second day of the conference.

Many thanks to the KOSS Law center for bringing this together and organizing such an event. Thanks also to the Korean NIPA (National IT Industry Promotion Agency) and the FSFE for their support of the event.

November 21, 2011

Rusty Russell: The Power of Undefined Values

Tools shape the way we work, because they change where we perceive risk when we write code.  If common compilers warn about something, I’ll code in a way that will trigger it in case of mistakes.  eg: instead of:

    int err = -EINVAL;
    if (something())
         goto out;
    err = -ENOSPC;
    if (something_else())
         goto cleanup_something;
...
cleanup_something:
    undo_something();
out:
    return err;

I would now set err in every branch:

    int err;
    if (something()) {
        err = -EINVAL;
        goto out;
    }
    if (something_else()) {
        err = -ENOSPC;
        goto out;
    }

Because when I add another clause to the initialization and forget to set err, gcc will warn me about it being uninitialized.  This bit me once, and it can be hard to spot the problem when you’re only reviewing a patch, not the code as a whole.

These days, we have valgrind, and despite its fame as a use-after-free debugger, it really shines at telling you when you rely on the results of an uninitialized field.  So, I’ve adapted to lean on it.  I explicitly don’t initialize structure members I don’t use in a certain path.  I avoid calloc(): while 0 is often less harmful than any other value, I’d much rather know that I’ve thought about and set up every field I actually use.  When changing code this is particularly important, and I spend a lot of my time changing code.  I have even changed to doing malloc() in some cases where I previously used on-stack or file-scope variables.  Valgrind doesn’t track on-stack usage very well, and static variables are defined to be zeroed, so valgrind can’t tell when I wander into the weeds.  I think these days, that’s a misfeature.

So, if I were designing a C-like language today, I’d bake in the concept of undefined values, knowing that the tools to leverage it are widely available.  10 years ago, I’d have said 0-by-default is safest, but times change.  I think Go chose wrong here, but it may not be as bad as C for other reasons.  I’d have to code in it for a few years to really tell.

November 08, 2011

Harald Welte: Going to attend Korean FOSS legal conference

Recently I had been invited by the Korean Open Source Software (KOSS) Law Center to attend their 2011 KOSS conference scheduled for November 17 and 18 in Seoul, Korea.

This conference is organized by the KOSS Law Center with support by the Korean Government (National IT Industry Promotion Agency). Its primary purpose is to share best practises in terms of FOSS licensing, license compliance but also FOSS community interaction within the Korean IT industry and the public sector.

I'm happy to present on Beyond Legal Compliance - Embracing the FOSS community, where I will outline that the primary focus should not be on to-the-letter legal compliance, but to a proactive way of interacting with the FOSS community. After all, collaborative development is what FOSS is all about...

However, due to a schedule conflict with the DeepSec 2011 conference in Vienna (where I'm giving a two-day GSM security workshop), I'm only able to attend the second day of the KOSS conference.

The speaker line-up for the KOSS conference is quite impressive, and it includes Karsten Gerloff (FSFE), Till Jaeger (JBB), Carlo Piana (FSFE), Keith Bergelt (OIN), Armijn Hemel (gpl-violations.org/Tjaldur) and others.

Unfortunately there seems to be no homepage, at least none with an English language title that Google would be able to find. Carlo Piana has mentioned the event in his blog four days ago.

UPDATE: There now is a conference page, although in Korean language only ;)

November 07, 2011

Eric Leblond: What’s new in coccigrep 1.6?

I did not write any article on coccigrep since the 1.0 release. Here is an update on what has been added to the software since that release.

C++ support

Coccinelle has a basic C++ support which can be activated by using the –cpp flag in coccigrep.

Patches information

The -L -v options on command line will display a description of the match available on the system.
$ coccigrep -L -v
set: Search where a given attribute of structure 'type' is set
 * Confidence: 80%
 * Author: Eric Leblond 
 * Arguments: type, attribute
 * Revision: 2
For the developer, this is obtained from structured comments put at the start of the cocci file:
$ head src/data/set.cocci 
// Author: Eric Leblond 
// Desc: Search where a given attribute of structure 'type' is set
// Confidence: 80%
// Arguments: type, attribute
// Revision: 2
@init@
This is thus an easy way to document the search operation. Please note, that this will also work for the operations put in the user or system custom directory.

Context line display improvement

Guillaume Nault has contributed a series of patches that greatly improved the display of context lines:
$ coccigrep -C 3 -t Packet -a flags -o set decode*c 
decode.c-90            -     }
decode.c-91            - 
decode.c-92            -     PACKET_INITIALIZE(p);
decode.c:93 (Packet *p):     p->flags |= PKT_ALLOC;
decode.c-94            - 
decode.c-95            -     SCLogDebug("allocated a new packet only using alloc...");
decode.c-96            -

Documentation

A man page is now available.

Python 2.5 support

The 1.6 release came with a code modification that permit coccigrep to run with python 2.5. Some users seem to still use this old version of Python and the support was not requiring to degrade coccigrep code. It has even improved it.

Option to read file lists from a file

Thomas Graf has contributed the -l option which provides a way to specify a file containing the list of the files to search in.

Operation improvement

The set operation has been improved and is now more accurate thanks to the support of all related operators.

Conclusion

Coccigrep is becoming more and more mature over time. The existing code base remains and a polishing work is currently under progress. One last point on the project is that some Linux and *BSD distribution seems to have done packages. This is the case of Aur, Gentoo, NetBSD, OpenBSD, Mandriva and soon Debian if the intention to package is confirmed.

October 06, 2011

Eric Leblond: Acquisition systems and running modes evolution of Suricata

Some new features have recently reach Suricata’s git tree and will be available in the next development release. I’ve worked on some of them that I will describe here.

Multi interfaces support and new running modes

Configuration update

IDS live mode in suricata (pcap, pf_ring, af_packet) now supports the capture on multiple interfaces. The syntax of the YAML configuration file has evolved and it is now possible to set per-interface variables. For example, it is possible to define pfring configuration with the following syntax:
pfring:
  - interface: eth4
    threads: 8
    cluster-id: 99
    cluster-type: cluster_flow
  - interface: eth1
    threads: 2
    cluster-id: 98
    cluster-type: cluster_round_robin
This set different parameters for the eth4 and eth2 interfaces. With that configuration, it the user launches suricata with
suricata -c suricata.yaml --pfring
it will be listening on eth4 and with 8 threads receiving packets on eth4 with a flow based load balancing and 2 threads on eth3. If you want to run suricata on a single interface, simply do:
suricata -c suricata.yaml --pfring=eth4
This syntax can be used with the new AF_PACKET acquisition module describe below.

New running modes

The running modes have been extended by a new running mode available for pfring and af_packet which is called workers. This mode starts a configurable number of threads which are doing all the treatment from packet acquisition to logging.

List of running modes

Here is the list of current running modes:
  • auto: Multi threaded mode (available for all packet acquisition modules)
  • single: Single threaded mode (available in pcap, pcap file, pfring, af_packet)
  • workers: Workers mode (available in AF_PACKET and pfring)
  • autofp: Multi threaded mode. Packets from each flow are assigned to a single detect thread.

af_packet support

Suricata now supports acquisition via AF_PACKET. This linux packet acquisition socket has recently evolved and it supports now load balancing of the capture of an interface between userspace sockets. This module can be configured like show at the start of this post. It will run on almost any Linux but you will need a 3.0 kernel to be able to use the load balancing features.
suricata -c suricata.yaml --af-packet=eth4

September 22, 2011

Rusty Russell: PLUG: Coding: let’s have fun!

The Perth Linux User Group are flying me across next month to speak, complete with on-couch accommodation! But since I can’t find an easily-linkable synopsis of my talk, here are the details:

It’s hard to describe the joy of coding, if you haven’t experienced it,  but in this talk will try to capture some of it.  Free/Open Source lets us remove the cruft which forecloses on the joy of coding: seize this chance!

I’ll talk about some of my favourite projects over the last 15 years of  Free Software coding: what I did, how much fun it was and some surprising results which came from it.  I’ll also discuss some hard lessons learned about joyless coding, so you can avoid it.  There will also be a sneak peak from my upcoming linux.conf.au talk.

There’ll be some awesome code to delight us.  And if you’re not a programmer we’ll take you our journey and show you the moments of  brilliance which keep us coding.

September 20, 2011

Rusty Russell: linux.conf.au: Hacking your badge for lca2012

Someone mentioned that you had to look at the source code if you wanted to hack your badge this year; I would have considered that cheating if I hadn’t known.  (It’s been a few years since I last hacked my badge).  But it helps if you look in the right place: http://lca2012.blogspot.com/2011/09/feeling-silly.html

Thanks to Tony Breeds for pointing me at that after I’d given up with the github upstream source…

September 19, 2011

Eric Leblond: OISF brainstorming: planning phase 3 (take 3)

GEO IP

Idea is to add a keyword that would be used to interact with GEOIP database (free at least) and be able to use it to detect things like control canal. For example, an IRC server in an non common country is certainly a control canal.

Live ruleset swap

A must have! This is vital for critical environnement. This is very costly in memory and this should be an option to avoid exploding low memory boxes.

Qosmos integration / API for data exchange

Bringing protocol analysis is an interesting point as it will help to increase performance and accuracy of the engine. Knowing the protocol permit to only run protocol related rules to flow of that protocol. And this avoid to have false detection by running the rules on bad protocol. OpenDPI technology and Qosmos technology integration is discussed. A common API is needed to be able to use both systems.

Global shared flowvars

Global flow var will permit to change the way we build rules. Not being constrained anymore to stream variable will increase the power of rules.

Host/app/OS table import

Idea is to load host type from file to be able to tune the host setting precisely.

IPFIX support

IPFIX support as entry or output could bring some advantages.

Conclusion

Matt Jonkman and Victor Julien will now summarize the input and publish on OISF website the planned features for phase 3 based on discussion about priority of the tasks that have been held.

Eric Leblond: OISF brainstorming: planning phase 3 (take 2)

DNS fast flux/anomaly detection

The idea is to detect malware and other things by collecting the DNS request and their answer and detecting anomaly. For example, if an host is making a lot of request to a domain. First part of the job on Suricata is to log all requests and their answer. Then analysis can occurs in the database.

File extraction

This is a work under progress linked with a third party contract. It permit to store exchanged files on disk for some application level protocol. It is possible to say: “store the file, if the content type is different from the extension”. File extraction works currently on HTTP. It focus on POST request to detect uploaded file.

Time-based counters

This aims at detection of regular behaviours which are often linked with command control connection. For example, triggering an alert if a specific DNS request is done every five minutes.

HTTP keyword improvement

HTTP keywords improvement are discussed, some specific keywords could be added to avoid the cost of pcre. A two parameters header match is suggested to support all possible keyword.

Output

Discussion is about logging the normalised application level content. Currently, only the packet triggering the lart is loggued and thus the information about why suricata has logged is lost. This is thus interesting to log the reconstructed application level message to permit the analyst to analyse the reason of the alert. Regarding the output module, it could be interesting to adds support for CEE logging which would be able to support the resulting composite alert. For performance reason, a barnyard output of this composite alert is interesting. It may be needed to suggest some change to support this application level alert.

Eric Leblond: Oisf brainstorming: planning phase 3 (take 1)

Performance improvement

As shown by Victor’s latest work on performance counters, there is a lot of work that can be done to improve performance. They are currently good but there is place for improvement. Proposal to provide off-loading or clustering is done. This is heavily discussed but as pointed out by Victor, it will be more interesting to do this in the next phase. Phase 3 should focus in improvement of current code. This will permit to use the upcoming Suricata killing features like global flow variable.

SSL preprocessor

Following the recent certicate authority attacks, a SSL preprocessor which is able to detect blacklist certificate and other things will be really interesting. It could also detect certificate property change whan connection to host or similar things. Decryption is not seen by the participants of the webex has important. Decryption would be performance killer without accelerator. Thus a limitation to certificate analysis is at first an interesting target

IP and DNS reputation

The idea is too exchange IP reputation between sensors to protect themself against offensive hosts as soon as it has been identified on one sensor. This requires exchange between nodes and this is already done by other projects. Doing it alone in Suricata would be to big and thinking about adding a way to interact easily with this project could be a first step.

Eric Leblond: Matt Jonkman: development avancement

Phase 2 development is almost over now. Among the completed major features:
  • Multithread
  • protocol discovery
  • smb logging
  • HTTP logging
  • flowvars
One of the advantage of Suricata over Snort is protocol discovery combined to HTTP parsing by libhtp. It provides a huge improvement over Snort as a lot of bad flow are using HTTP on non standard ports.

Eric Leblond: Victor Julien: Development status

Work has started in september 2007. The work depends on some externel library like multithread of input handling library. The main external depedency is libhtp which is initally developped by Ivan Ristic. The development is managed in a single git repository. Victor is the only one with commit right. The review are done by Victor and cross review are made by developpers. Work unit for developers are tasks which are written by Victor and describe a specific task to do. This task are mainly done by OISF funded developers. Some simpler task are let to the comunity and everyone can help with this. To offload Victor’s load, subsystem mainteners are nominated:
  • Eric Leblond: packet acquisition
  • Anoop Saldanha: detection part
They will have freedom on the way to improve the subsystem they are in charge. The development is currently done with two branches (1.0 which is bug fix only and master). Victor is currently not happy with this and would like to switch to time-based release. This is discussed as this can be difficult for company to deal with frequent updates. A funding of maintenance by companies could help to keep the current working system. Peter Manev is in charge of the QA. There is a lot of work to do in this area. Unit test is currently good but there is a lot of work to do to improve detection of regression. Performance has been improved in 1.1 with a focus on efficient algorithm. CUDA support needs help. The performance is still lower with than without and thus this really need developer power! Performance profiling have been added recently by Victor and it shows clearly that work is needed on this. To sum up, the two main areas where help will be most than welcome are QA and performance profiling.

Eric Leblond: Matt Jonkman: introduction speech

Matt presents the goal of the OISF brainstorming session:
  • Make a status of the foundation
  • Grabbing new ideas
The session will be interactive and anybody is invited to participate through physical intendance or webex. The foundation is non-profitable and aim at building a powerful engine for us all. OISF is member og the HOST program and happily supported by some industrials.

Foundation business

Matt fills he can not give enough times to the foundation due to his work at EmergingThreat and propose to hire a General Manager that would take care of finding the funding and administrative part. The funding renewal was smallest than planned and the work on getting funding must be increased. Situation is not bad but improvement could be interesting.

Potential major projects

Universal Rules Language

This is a highly difficult problem. It has been wanted by major actors for years but no successfull and clean approach has been found.

Commercial support

A lot of industrials want to have access to commercial support to integrate Suricata in their product. Developers can not do the job but they can form some support people to have them become experts.

OpenDPI/Qosmos integration

The idea is to add support for multiple protocols to be able to provide high level rules for a lot more of protocols.

Scada

Digital bound give a copyright usage to the foundation to integrade a preprocessor into Suricata.

Snortsam integration

Snortsam is now a plugin to barnyard and this made it more easy to use.

August 25, 2011

Rusty Russell: Speeding CCAN Testing (By Not Optimizing)

So, ccanlint has accreted into a vital tool for me when writing standalone bits of code; it does various sanity checks (licensing, documentation, dependencies) and then runs the tests, offers to run the first failure under gdb, etc.   With the TDB2 work, I just folded in the whole TDB1 code and hence its testsuite, which made it blow out from 46 to 71 tests.  At this point, ccanlint takes over ten minutes!

This is for two reasons: firstly because ccanlint runs everything serially, and secondly because ccanlint runs each test four times: once to see if it passes, once to get coverage, once under valgrind, and once with all the platform features it tests turned off (eg. HAVE_MMAP).  I balked at running the reduced-feature variant under valgrind, though ideally I’d do that too.

Before going parallel, I thought I should cut down the compile/run cycles.  A bit of measurement gives some interesting results (on the initial TDB2 with 46 tests):

  1. Compiling the tests takes 24 seconds.
  2. Running the tests takes 12 seconds.
  3. Compiling the tests with coverage support takes 32 seconds.
  4. Running the tests with coverage support takes 32 seconds.
  5. Running the tests under valgrind takes 204 seconds (17x slowdown)
  6. Running the tests with coverage under valgrind takes 326 seconds.

It’s no surprise that valgrind is the slowest step, but I was surprised that compiling is slower than running the tests.  This is because CCAN “run” tests actually #include the entire module source so they can do invasive testing.

So the simple approach of compiling up once, with -fprofile-arcs -ftest-coverage, and running that under valgrind to get everything in one go is much slower (from 325 up to 407 seconds!).  The only win is to skip running the tests without valgrind, shaving 11 seconds off (about 2%).

One easy thing to do would be to compile with optimization to speed the tests up. Valgrind documentation (and my testing) confirms that using “-O” doesn’t effect the results on any CCAN module, so that should make it run faster, for very little effort.  When I actually measured, total test time increases from 407 seconds to 495, because compiling with optimization is so slow.  Here are the numbers:

  1. Compiling the tests with optimization (-O/-O2/-O3) takes 54/77/130 seconds.
  2. Running the tests with optimization takes 11/11/11 seconds.
  3. Running the tests under valgrind with optimization takes 201/208/208 seconds

So no joy there. Time to go and fix up my tests to run faster, and make ccanlint run (and compile!) them in parallel…

August 14, 2011

Rusty Russell: Professional Photographers and Licensing: Copyright Sucks

So, Alex scoured through wedding photographers, we chose one, met them, got the contract… and it stipulates that they own the copyright, and will license the images to us “for personal use”.  So you pay over $3,000 and don’t own the images at the end (without a contract, you would).  That means no Wikipedia of course, but also no Facebook; they’re definitely a commercial organization.  No blogs with ads.  In the unlikely event that Alex or I change careers and want to use a shot for promotional materials, and the photographer has died, gone out of business or moved overseas, we’re out of luck even if we’re prepared to pay for it.

The usual answer (as always with copyright) is to ignore it and lie when asked.  But despite my resolution a few years ago to care less about copyright, this sticks in my craw.  So I asked: it’s another $1,000 for me to own the copyright.  I then started emailing other photographers, and that seems about standard.  But why?  Ignoring the obvious price-differentiation for professional vs amateur clients, photographers are in a similar bind to me: they want to use the images for promotion, say, in a collage in a wedding magazine.  And presumably, the magazine insists they own the copyright.  Since the photographers I emailed had varying levels of understanding of copyright, I can totally understand that simplification.

Fortunately, brighter minds than I have created a solution for this already: Creative Commons licensing.  On recommendation of one of Alex’s friends, we found a photographer who agreed to license the images to us under Creative Commons Attribution without additional charge; in fact, he was delighted to find out about CC, since the clear deeds make it easier for him to explain to his clients what rights they have.  All win!

July 21, 2011

Rusty Russell: License Boilerplates

CCAN is supposed to be about the code, so I’ve avoided the standard GPL boilerplate comment at the top of each source file.  I reluctantly include a symlink to the full license text in each directory now, since lawyers approached me to clarify the single “License:” line in _info.  A useful discussion on the samba-technical mailing list has reinforced my view that it’s marginal clutter, but most CCAN modules now have a one-line courtesy comment such as “/* Licensed under LGPLv2.1+ – see LICENSE file for details */” at the top of each .c and .h file.

Please make a conscious choice here: if license enforcement is a high priority for your project you probably want copyright assignments, license boilerplates and click-through agreements for everyone who downloads your source code.  But if you’re spending significant time or effort on legal issues for your little coding project, you’re probably doing it wrong…

(ccanlint now scans for common license boilerplates, as well as those comments; this means we can also detect use of incompatible licenses inside modules, or dependent modules.  The former test noticed that I’d labelled the md4 module as LGPL, yet it’s actually GPL.  The latter spotted that ccan/likely (LGPL) depends on ccan/htable (GPL): legal (the whole thing is actually GPL), but misleading, as Michael Adam noted).  Automating this stuff is a clear win for a project like CCAN.  I also re-licensed a bunch of useful-but-trivial modules from LGPL to public domain, as I want the BSD modules to use them).

July 15, 2011

Rusty Russell: Bitcoin for something useful.

I wanted a scalable version of this poster: so I offered 1BTC on the bitcoin forum and someone produced a version I can print and put on my wall in my home office.

Here is the SVG. Enjoy!.

June 06, 2011

Rusty Russell: Waiting For The Great Bitcoin Crash (GBC)

I like bitcoins.  A simple open source client, a well-run developer community, clever algorithms, decentralized assurance model, and of course near-zero transaction fees.  For all the economic arguments (some of which sound like early anti-Wikipedia arguments, though I hesitate to argue by analogy), when I first used it to tip a website, I fell in love.  It took me about 3 minutes to transfer $50 from my bank account to a stranger’s via my bank’s website, with 2 days latency.  It took me about 5 seconds to send 0.1BTC to a stranger, with about 10 minute latency.  It was a revelation.

But, while bitcoins rock, volatility doesn’t.  It’s currently a bit hard to get some bitcoins, and the price has rocketed by speculation.  It’s not just a cute FOSS project, it’s becoming a real market, and surely those piling in now are susceptible to hacking, scams and the inevitable hiccups that go with any project ramping up.  It’s all going to come crashing down to earth.

Good.  My hope is that after the GBC the speculators will move on.  Volatility will settle.  The boring work of accreting trust in this new tool will continue.  I fervently hope that it we will appreciate its true utility once the dust has settled and the Man Loses A Million Dollars and House in Virtual Currency headlines have faded.

[Yes, you can tip me in bitcoins!  I think it's a good habit, fun to do, and better than ads as I get a warm fuzzy feeling of appreciation.  I'll be using any tips I get to tip other sites which take BTC].

May 19, 2011

Rusty Russell: “If you didn’t run code written by assholes, your machine wouldn’t boot”

This was passed on to me by Ben Elliston, ex-gcc hacker and good guy.  Amusing in context, but the corollary is that working on free software means you’ll encounter such people.  You may have to work with them.  You may have to argue with them (and they may be right).

Quite some time ago I was horrified by the private behaviour of a hacker I deeply respected: malicious, hypocritical stuff.  And it caused an internal crisis for me: I thought we were all striving together to make the world a better place.  Here are the results I finally derived:

  1. Being a great hacker does not imbue moral or ethical characteristics.
  2. Being a great coder doesn’t mean you’re not a crackpot.
  3. Working on a great project doesn’t mean you share my motivations about it.

This wasn’t obvious to me, and it seems it’s not obvious to others.  A-list actors endorsing Scientology doesn’t make it a good idea.  Great FOSS political work was done by a certain obnoxious LWN-haunting nutball.  Julian Assange may or may not be guilty of crimes in Sweden. Many of my kernel coworkers believed that GPLv3 was somehow a radical change from GPLv2.  Some sweet code has been written by gun nuts, lechers, holocaust deniers and (in at least once case) someone who believes that fasting will cure cancer.

In any walk of life you have to work with all kinds; having to do so in my dream job as FOSS hacker was a hard lesson for me.  It’s great to work with people whose skills you respect, but don’t expect to like them all.

May 11, 2011

Jesper Dangaard Brouer: Announcing: The IPTV-Analyzer

I'm happy to announce the first official release of the IPTV-Analyzer project, as an Open Source project.



The IPTV-Analyzer is a continuous/real-time tool for analyzing the contents of MPEG2 Transport Stream (TS) packets, which is commonly used for IPTV multicast signals. The main purpose is continuous quality measurement, with a focus on detecting MPEG2 TS/CC packet drops.

The core component is an iptables (Linux) kernel module, named "mpeg2ts". This kernel module performs the real-time Deep Packet Inspection of the MPEG2-TS packets. Its highly performance optimized, written for parallel processing across CPU cores (via RCU locking) and hash tables are used for handling large number of streams. Statistics are exported via the proc filesystem (scalability is achieved via use of the seq_file proc API). It scales to hundreds of IPTV channels, even on small ATOM based CPUs.

Please send bugreports, patches, improvement, comments or insults to: hawk@comx.dk

February 28, 2011

Stephen Hemminger: net-snmp ip-forward table performance problems

Some performance problems are hard and complex, but others seem to be
due to just plain stupidity. This is the saga of SNMP daemon and a
full BGP route table. Way back in 2006, Vyatta discovered that if an
SNMP walk was done on a server with a full BGP route table, it would
peg the CPU and never complete. A full BGP route table is 500K entries
or more so it does a good job of exposing scalability nightmares. The
initial fix was to disable the caching of the route table in SNMP
which made it return no entries. Hardly a good fix, but returning
nothing is better than crashing.

I began investigating with the simple tools of packet capture with wireshark
and syscall capturing with strace. The first discovery was that each
request caused the TCP wrappers library to open and read
/etc/hosts.allow and /etc/hosts.deny. Bogus on two counts:

  1. Debian is shipping the 2 files with no real entries only comments.
    Each packet caused file to be read but there was really no data.
    It would have been better to have the file not exist and have the open fail.
  2. But for our distribution, there was no point in enabling
    TCP wrappers anyway.

The fix was simple to disable tcp-wrappers.

The net-snmp daemon retrieves the ipv4 and ipv6 routing table the old
school way through /proc. This isn't a total disaster but since the
route entries in /proc start with an interface name and net-snmp wants
an ifindex it looks up each entry. That is 300K extra ioctl
calls. Short term hack was to just cache last ifname -> ifindex
translation
; later I replaced it with a netlink route dump
which gives ifindex (surprisingly netlink route dump is already used
in another MIB).

Next observation was that it is stupid to use snmpwalk to walk
the whole system and instead use snmpbulk. This helps but still
the walk would not complete.

The real discovery was when looking at the net-snmp container
code. Internally, net-snmp uses an objectish abstraction to store
data, and the main ones are a flat table and a linked list. The table
is stored in sorted order for fast lookup and sequential access. New
entries are placed at the end of the table and a dirty bit is set for
next lookup. The problem is that each insert also does a lookup for
duplicates which causes a sort
. This makes inserts do quicksort for
each entry -- there is the scalability problem.

To make it more interesting net-snmp creates the route table
twice. First it reads table from /proc and puts entries in one table,
then walks that table to create the cache table used for lookup.

Loading the cache with non-scalable insertion takes several minutes on
a really fast machine, and the cache timeout is 30 seconds. This
ends up causing the CPU load because each request finds a dirty cache
and does a full reload.

Now for the good news, fixing the insert wasn't the hard. The first
step was realizing that the temporary table doesn't have to a table
container, instead it can be changed to a FIFO (linked list). The FIFO
container is O(1) on insert. The actual cache container requires a
different approach. The table container has an unused flag to allow
duplicates in the table. Turning the ALLOW_DUPLICATES flag makes
inserts much faster because the table is not sorted until the first
request. These get the table load down to less than a 5 seconds
on fast machine.

Lastly a couple of other improvements help as well. When the
binary_table is expanded, the code would calloc a new area, copy the
old data and then free the original. This is much worse than just
using realloc which can usually in place expansion when table is
getting large. The sort function can be optimized to avoid calling the
comparison function, and using a faster insertion sort for small sub
sections. These get the load down to less than a second.

Extra credit to the first developer who implements a new net-snmp
container using something better for big tables like AVL or B-tree.

January 10, 2011

Jesper Dangaard Brouer: Bufferbloat: Wireless is worse than expected

Jim Gettys also pointed out, that bufferbloat also exists on Wifi wireless connections[1].

I didn't take his word for it, but tested it my self. And the result was far worse than I expected! In optimal conditions sitting next to the Wifi AP, I can easily introduce a 600 ms delay (0.6 sec), and moving further away I quickly see latency approaching 1 sec. See Gettys tests[1] for more info.

This bufferbloat issue is going to get worse, as we get more bandwidth on our broadband connections. This means that we have not see problem at its full extend yet.
Thus, lets fix it before it gets out of hand!

An interesting property of Wifi bufferbloat is, that the queue/bloat happens on you own machine (when uploading).
Linux is to blame, big time!
An some point we/the Linux kernel developers, increased the default transmit queue length (txqueuelen), from 100 packets to 1000 packets (happend when the netcards went from 100Mbit/s to 1000Mbit/s).

The major problem here is, that wireless devices also inherited this default txqueuelen setting of 1000 packets.

This is fortunately easily fixed/adjusted via e.g the command:
ifconfig wlan0 txqueuelen 10
But unfortunately, this does not remove all the TX bufferbloat in the system. The Wifi driver and hardware also have significant TX buffering, as Getty also describes[2].

Setting the txqueuelen to 1, I still experienced an average delay of 96ms and max 116ms, when linking 24 Mbit/s.

24 Mbit/s * 116 ms = 348000 bytes
= 232 packets (with 1500 bytes packets)
Thus, a hardware bufferbloat on minimum 232 packets. Gettys report his wifi hardware has 255 packets in hardware.

Fixing the hardware Wifi bufferbloat, is harder at we need to fix each individual driver and most of them not exported the functionality to tools like ethtool).
We need to at least mitigate the wifi bufferbloat, to some sane level.
Next we can talk about choosing a queueing stategy AQM for wifi networks.

Links:
[1] http://gettys.wordpress.com/2010/12/02/home-router-puzzle-piece-two-fun-with-wireless/
[2] http://gettys.wordpress.com/2011/01/03/aggregate-bufferbloat-802-11-and-3g-networks/

January 09, 2011

Jesper Dangaard Brouer: Buffer Bloat: The calculations

The buffer-bloat blog posts by Jim Gettys[1][2], are very interesting, and relevant for everybody with a broadband connection (especially asymmetric links). They are well written, but also too long and too many posts.

In this blog post, I'll explain what is going on, by showing how you can calculate your own latency issues.

The issue raised by Gettys, is that buffer-bloat (too big buffers on the network path) has fundamentally broken Internet broadband connections [2].
As buffer-bloat can introduce enough latency to cripple your line. Basically killing the possibility of interactive and realtime services, being delivered (e.g. by companies) over your broadband connection.

I'm very happy to see that Gettys is bring this issue up again.
Back in 2005, I discovered the same issues as Gettys. I wrote my masters thesis[3] about the issue, and even created an Open Source "mitigation" solution the ADSL-optimizer[4]. It seems my solution has not gained a wider use.

The major contribution for my side, is to take the ADSL overhead into account when doing QoS packet shaping. (Everything is in mainline, just use/add the TC options "linklayer adsl" and "overhead", if you already have a Linux box doing QoS on your line.)

The issue is that:
A single TCP upload cause a delay of 1.2 seconds (on a 512 Kbit/s ADSL line)
(see thesis[3] page 21).

Lets calculate what is happening, without going into details of why TCP/IP miss-behaves and cause queues to build (details are in my thesis[3]).

Before starting the calculations, here is a beautiful cite by Jim Gettys[1]:
"Large network buffers can be thought of as 'dark buffers', analogous to 'dark matter' in the universe; they are undetectable under many/most circumstances, and you can detect them only by indirect means. Buffers do not cause problems when they are empty. But when they fill they introduce additional latency (and create other problems, possibly very severe) to other traffic sharing the link."

Given the line speed and the delay, we can calculate the buffer size
(this is the bandwidth-delay product). Due to ADSL overhead the
effective bandwidth is actually 454 Kbit/s of the 512 Kbit/s line,
and the measured delay was 1138 ms.
454 Kbit/s * 1138 ms = 64581 bytes
This, corresponds to the TCP window-size. Thus, this is not the maximum buffer-size of the modem. (Use several TCP connection or UDP to find your maximum ping RTT, and calc your buffer size).

Where does the delay come from?!

The delay consists of different components, the important one in our case is the transmission delay (combined with the packets in queue).

The transmission delay of a 1500 bytes (MTU) packet is:
1500 bytes / 454 Kbit/s = 26.34 ms

Thus, the experienced delay is the time it takes to empty the packets in the queue, which is greatly dependend on the line speed.
E.g. 64000 bytes / 454 Kbit/s = 1127 ms.

With a RTT delay of 150 ms, your interactive SSH connection will feel sluggish.
(A side note on ADSL is that; the processing delay in the ADSL modem can get as large as 60 ms, and is caused by the interleaving depth, but its "fortunately" a constant fixed delay on the path)

Increasing the bandwidth, will reduce the latency, but its not the
solution. Besides, ADSL technology is often limited to a 1024 Kbit/s
upstream link.

The Point:
"ISPs SHOULD configure the buffer size based upon the link bandwidth"

I have a feeling that the ISP just configure a default queue size, and tune the queue size based upon max throughput on their largest product.

The line I did my measurements on, I could see delay on 3.3 sec, thus a buffer-bloat size of 187332 bytes (454 Kbit/s * 3300 ms), or 125 packets at 1500 bytes (MTU). Simply crazy!

For more details on formulas and calculation see thesis page 19 to 27.

Links:
[1] Jim Gettys: introducing-the-criminal-mastermind-bufferbloat
[2] Jim Gettys: whose-house-is-of-glasse-must-not-throw-stones-at-another
[3] http://www.adsl-optimizer.dk/thesis/
[4] http://www.adsl-optimizer.dk/
[5] http://en.wikipedia.org/wiki/Jim_Gettys

December 10, 2010

David Miller: Metrics, metrics, metrics...

First a quick shout-out to the Colbert Nation.

Next, let's talk about route metrics and sharing, shall we?

Metrics exist to specify attributes for a path. For example, what hoplimit should we use when using this route? What should the path MTU be? How about the TCP congestion window or RTT estimate?

Metric settings come from two places.

  • Administrator settings
  • On the fly measurements

The administrator can attach initial metric values to routes when they get loaded into the kernel. This helps deal with specialized situations, but it's not very common at all.

There is a special metric, called "lock" which is a bitmask. There is a bit for each of the existing metrics. If the bit is set, it means that metric should not be modified by the kernel and we should always respect the value the administrator placed there.

The kernel itself will, on the fly, adjust metric values. For example, at the teardown of every TCP connection the kernel can update the metrics on the route attached to the socket. These updates are based upon measurements made during the life of the TCP socket. There is a sysctl to disable this automatic TCP metric update mechanism, for testing purposes.

For a router, metrics don't really change. So, in theory, we could take the defaults stored in the routing table and just reference those directly instead of having a private copy in every routing cache entry.

There are some barriers to this, although none insurmountable.

First of all, in order to share we have to be able to catch any dynamic update so we can unshare those read-only metrics. The net-next-2.6 tree has changes to make sure every metric write goes through a helper function. So we have the traps there and ready to go, problem solved.

Next, we actually change the metrics a little bit when we create every single routing cache entry. Essentially we pre-compute defaults. This is pretty much unnecessary, and actually could theoretically cause some problems in some cases. The metrics in question are the hoplimit, the mtu, and advertised MSS, For ipv4 these are set in rt_set_nexthop.

If the route table metric is simply the default (ie. zero) we pre-calculate it. These calculations are pretty simple and could be done instead when the value of the metric is actually asked for.

Since the defaults are address-family dependent we will need to abstract the calculations behind dst_ops methods, but that's easy enough. Accesses to each of these three metrics then need to go through a helper function which essentially says:

	if (metric_value == 0)
		metric_value = dst->ops->metric_foo_default(dst);
	return metric_value;

With that in place we will rarely, if ever, modify metric values in the routing cache metric arrays. Then the next step is putting unshared metrics somewhere else (f.e. inetpeer cache), and then changing dst_entry metrics member to be a pointer instead of an array.

Initially the pointer will point into the routing table read-only default metrics. On a COW event, we'll hook up an inetpeer entry to the route and retarget the metrics pointer to the inetpeer's metrics storage.

November 21, 2010

David Miller: IPv4 source address selection in Linux

Often when we connect a socket up, or bind it, we really don't care what source address is used for the resulting connection. We let the kernel decide.

The usual sequence is:

	fd = socket(PF_INET, SOCK_{STREAM,DATAGRAM}, IPPROTO_{TCP,UDP});
	connect(fd, { AF_INET, $PORT, $DEST_IP_ADDR }, ...);
Here we leave the source port and source address to be selected by the kernel via a facility called auto-binding. Oddly enough TCP and UDP use a different ordering for selecting the source address vs. selecting the source port.

UDP will first select a local port to use, and make this choice in a global namespace of ports for the machine. TCP on the other hand will have a source address selected first, and then try to allocate a local port using the source address as a partial key. This latter ordering is necessary in order to handle SO_REUSEADDR correctly.

Source address selection itself happens via routing.

The route lookup will be performed with the source address in the flow lookup key set to zero. After the route, based upon destination address, is found the routing code uses the next-hop interface to select an appropriate source address.

All of these results are propagated into the routing cache entry.

It is interesting to note that the routing cache entry created in such a situation will have a zero source address as well in it's routing key. So the next time a routing lookup occurs to the same destination, but without a specified source-address, we'll match this routing cache entry.

This little detail creates some minor complications when handling ICMP messages for redirects. Since we must update any potentially matching routing cache entries, we have to probe the hash table multiple times. Once with an explicit source address in the lookup key, and once with the source address in the key set to zero. Otherwise we won't update all of the entries that we need to.

Actual source address selection is performed by inet_select_addr(). Either via direct calls made by net/ipv4/route.c, or indirectly via __fib_res_prefsrc() This function works with a "scope" specification which says which realm in which the source address must be valid. Most of the time this is RT_SCOPE_UNIVERSE.

The linked list of ipv4 interface addressed for the interface is traversed, and the first address with an appropriate scope is selected.

Even though the flow key of the routing cache entry will have a zero source address, the source address selected is remembered in rt->rt_src so that users of it can see what source address to use.

Finally, routes loaded into the kernel can have an explicit "preferred source address" attribute set by the administrator. This value will fully preempt whatever inet_select_addr() would have choosen.

October 01, 2010

Jesper Dangaard Brouer: tcpdump vs VLAN tags

In this post I will explain why tcpdump sometimes doesn't show the VLAN (802.1q) tags, when dumping network traffic.

The issue is related to libpcap and VLAN hardware acceleration in the Linux kernel.

Patrick McHardy, fixed the kernel in version 2.6.27, and also fixed libpcap.
His patches for libpcap has the subject:
"libpcap: VLAN acceleration support".
Which got accepted, back in Aug 2008.

So whats the problem?

The problem is that tcpdumping VLAN IDs works under Ubuntu, but NOT under Debian.

I'm running a new kernel 2.6.35 on the Debian machine (which has the support >2.6.27). So I downloaded the libpcap and tcpdump source and compiled it manually, BUT it still didn't work!?! (even with the latest libpcap 1.1.1).
This was strange!
Digging into the cause of the issue, I found that is related to an old include file on the Debian system.

/usr/include/linux/if_packet.h

This old header file, cause the libpcap configure command to disable the libpcap VLAN feature. When running configure for libpcap, look for:

"checking if tpacket_auxdata struct has tp_vlan_tci member... no"

Remember this issue related to all programs using libpcap, which also includes Wireshark.

I hope confused VLAN dump users, will find this blogpost...

Update

The binary tcpdump on Debian Squeeze actually works with hardware accelerated VLAN, although you have specify option '-e' to make it show the VLAN tags.
BUT here it gets really strange...
It will NOT work if you compile libpcap and tcpdump you self, on Debian Squeeze (GCC 4.4.5).
The header file if_packet.h contains the correct structs, but libpcap's configure command will refuse to enable the VLAN code, as the check (of if_packet.h) fails.

The problem this time is related to the compiler version.

GCC 4.4 does not indirectly include the type of u_int, which the configure code choose to use for this test. Thus, the if_packet check fails due to the error: ‘u_int’ undeclared, which it interpreted as no VLAN support. Thus, the reason the binary works on Squeeze is that the maintainer has compiled it with an older version of GCC. Argh!

When I find the time, I'll post some patches to the maintainer of libpcap.

August 30, 2010

David Miller: How GRO works

All modern device drivers should be doing two things, first they should use NAPI for interrupt mitigation plus simpler mutual exclusion (all RX code paths run in software interrupt context just like TX), and use the GRO NAPI interfaces for feeding packets into the network stack.

Like just about anything else in the networking, GRO is all about increasing performance. The idea is that we can accumulate consequetive packets (based upon protocol specific sequence number checks etc.) into one huge packet. Then process the whole group as one packet object. (in Network Algorithmics this would be principle P2c, Shift computation in time, Share expenses, batch)

GRO help significantly on everyday systems, but it helps even more strongly on machines making use of virtualization since bridging streams of packets is very common and GRO batching decreases the number of switching operations.

Each NAPI instance maintains a list of GRO packets we are trying to accumulate to, called napi->gro_list. The GRO layer dispatches to the network layer protocol that the packet is for. Each network layer that supports GRO implements both a ptype->gro_receive and a ptype->gro_complete method.

->gro_receive attempts to match the incoming skb with ones that have already been queued onto the ->gro_list At this time, the IP and TCP headers are popped from the front of the packets (from GRO's perspective, that actual normal skb packet header pointers are left alone). Also, the GRO'ability state of all packets in the GRO list and the new incoming SKB are updated.

Once we've committed to receiving a GRO skb, we invoke the ->gro_complete method. It is at this point that we make the collection of individual packets look truly like one huge one. Checksums are updated, as are various private GSO state flags in the head 'skb' given to the network stack.

We do not try to accumulate GRO packets infinitely. At the end of a NAPI poll quantum, we force flush the GRO packet list.

For ipv4 TCP there are various criteria for GRO matching.

  • Source and destination address must match
  • TOS and protocol fields must be the same
  • Source and destination ports must match
Certain events cause the current GRO bunch to get flushed out. For example:
  • ID field not being in sequence with existing packets
  • Don't fragment bit clear
  • TCP CWR congestion indication being set
  • TCP ACK sequence mis-match
  • Any TCP option mis-match
  • TCP sequence not being in-order

The most important attribute of GRO is that it preserves the received packets in their entirety, such that if we don't actually receive the packets locally (for example we want to bridge or route them) they can be perfectly and accurately reconstituted to the transmit path. This is because none of the packet headers are modified (they are entirely preserved) and since GRO requires completely regular packet streams for merging, the packet boundary points are known precisely as well. The GRO merged packet can be completely unraveled and it will mimmick exactly the incoming packet sequence.

GRO mainly the work of Herbert Xu. Various driver authors and others helped him tune and optimize the implementation.

August 26, 2010

David Miller: Converting sk_buff to list_head.

I've been trying to make this happen, off and on, for at least two years now. Most of the kernel is straightforward and uses the skb_*() interfaces we have for manipulating skb objects on a list.

So for those, simply tweaking the interfaces in skbuff.h will make them all "just work".

However there are a few other spots in the kernel which manipulate the SKB list pointers directly:

  • SKB fragment lists have a head of skb_shinfo(skb)->frag_list of the head skb and use only skb->next for linkage.
  • The GRO handling uses both ->prev and ->next with a single pointer head at napi_info->gro_list
  • Both ISDN PPP and the generic PPP code have a fragmentation handling engine which manipulates the SKB list pointers directly. I've very nearly converted the ISDN side to use the standard skbuff.h list interfaces, but it added regressions and I had to eventually revert.
  • The socket backlog handling uses a by-hand coded FIFO tail queue list of SKBs.

I'm taking another stab at this, and hopefully I can work out these wrinkles. It'd be a really nice change because of lot of uses of "struct sk_buff_head" which don't care about the spinlock or the packet count can be converted to simply "list_head" saving serious space in various datastructures.

Copyright (C) 2001-2010 by the respective authors.