June 4, 2012

How to simulate high CPU load on Unix

Sometimes you want to stress a server by simulating high CPU load. A friend of mine tried writing a Fibonacci sequencer, but there is a simpler way: GNU coreutils contains the relatively unknown "yes" program.
yes prints the command line arguments, separated by spaces and followed by a newline, forever until it is killed. If no arguments are given, it prints ‘y’ followed by a newline forever until killed.
This gives us exactly what we want: an endless loop to keep the processor busy. If you have only one processor core, the following is sufficient:


$ yes > /dev/null


Redirecting to /dev/null is important because we don't want to spend time waiting on IO. We want the processor looping as fast as possible.
If you have more than one processor core, as most modern CPUs do, running one instance is not enough. To load all the cores, you need at least one instance per core. How many cores does your computer have? On Linux, grepping /proc/cpuinfo will give you that information.


$ grep processor /proc/cpuinfo | wc -l
4


Combining what we know now, it is trivial to write a one-liner that maxes out all the cores on the machine.


$ CN=`grep processor /proc/cpuinfo | wc -l`; \
  for i in `seq 1 $CN`; do \
  yes >/dev/null & \
  done
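
If your coreutils also provides nproc and timeout (both are standard GNU coreutils tools), the loop can be made self-terminating, so a forgotten run doesn't peg the machine forever. A sketch:

```shell
# One busy loop per core; timeout(1) kills each loop after 5 seconds.
# Raise the 5 to whatever load duration you need.
for i in $(seq 1 "$(nproc)"); do
  timeout 5 yes > /dev/null &
done
wait  # returns once every loop has been killed
```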


To stop the load, if those are the only background jobs in your shell:
$ kill `jobs -p`


or
$ killall yes

February 2, 2012

The name behind the proxy

I had an increase in visits to my blog recently due to Haaretz's publication of my findings. Looking over the statistics, I noticed several visits from LinkedIn, where I also have a profile. One entry got my attention: someone from R&D at PeerApp had looked at my profile. After checking their site and applying Occam's razor to all the data, I'm quite sure at this point that this is exactly what BezeqInt uses as its "sneaky proxy". The same goes for Netvision, as mentioned before, and for Pelephone (see their press release).

Schematic from PeerApp site.
http://www.peerapp.com/Solutions/Network.aspx

Makes me wonder how widespread this cache poisoning problem is. Is it a configuration problem? Is it inherent to PeerApp's technology? I had an agreement with Haaretz to hold off and not publish my findings, giving BezeqInt time to patch and/or reconfigure their system. Even after disclosing the exact details to their technical staff, it seems like nothing has been done. So far BezeqInt has denied the viability of poisoning the cache. My tests with other people using BezeqInt lines prove otherwise.

January 18, 2012

BezeqInt cache poisoning demonstration


Update 3, 15/2/2012: Demo is closed.

Update 2, 29/1/2012: What I thought was a fix is not. It looks like BezeqInt is trying to block my demo instead of fixing the problem; I can't be sure, though. So I made another demo. You now need to be in the 109.65.*.* range and fetch http://i.imgur.com/hggke.jpg
Original MD5: be902241dc59f19950cf4f14f6a4f33e
Poisoned MD5: 7b0e031ba89106b8cd2a988aadfedb8b

Update, 29/1/2012: The demo doesn't work anymore. Apparently BezeqInt finally fixed it, 29 days after being notified. I'll be checking it more thoroughly later.

As a continuation of my "sneaky proxy" series of posts, I've prepared a little demonstration of cache poisoning. I uploaded a PNG image to imgur.com and poisoned BezeqInt's proxy cache with a slightly different file. While my proof of concept uses a PNG image, it is just as easy to do with an .exe or any other type of file.

In order to get the poisoned file from the cache, you need to be in the 79.18?.*.* range. Go to whatismyip.org and check whether your IP starts with 79.18?... If not, reboot your router until you land in that range. Next, download this image http://i.imgur.com/beUai.png and check its MD5 checksum.
If you are a Linux/macOS user, run:

$ md5sum beUai.png

Otherwise you can use this service to calculate the checksum by uploading the file.

MD5 of the original file at imgur.com:
4a1d744319af6d598f836da5d0e3e979  beUai.png

MD5 of my "poisoned" file that I forced the proxy to cache:
ab5192e9077074029b35d5d15de6cf05  beUai.png
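
The whole check can be scripted. A sketch, assuming curl and md5sum are available; the URL and the two checksums are the ones from this demo:

```shell
# Fetch the image and report which version the proxy served,
# comparing against the checksums listed above.
ORIG=4a1d744319af6d598f836da5d0e3e979
POISON=ab5192e9077074029b35d5d15de6cf05

curl -fsS --max-time 30 -o beUai.png http://i.imgur.com/beUai.png \
  || echo "download failed"

if [ -s beUai.png ]; then
  sum=$(md5sum beUai.png | cut -d' ' -f1)
  case "$sum" in
    "$ORIG")   echo "clean: you got the original file" ;;
    "$POISON") echo "poisoned: you got the proxy's cached copy" ;;
    *)         echo "unexpected checksum: $sum" ;;
  esac
fi
```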

If you want to download the original file, you need to be in some other range or, better, use a proxy abroad.
Write to me about what you get.

January 3, 2012

Speed test?

As we all know by now, BezeqInt runs a quite invasive caching proxy. I wondered how it affects the broadband speed-test applications out there. Many of us have used them at least once. So I did a Google search and tried a dozen of them on my 15Mbps line. You can see the www.speedtest.net results on the left.

Sydney
Tokyo
New York
Wherever I download from, the speed is always ~15Mbps. Of course, it is too good to be true. Most of the applications I tried used a plain HTTP connection to download a test file. Speedtest.net is no exception, so it wasn't a surprise when I found that all the files used in the measurements were served by the BezeqInt proxy, rendering the results quite meaningless. The most interesting part is that speedtest.net generates unique URLs for measurements, although the downloaded file is always the same. Which begs the question: how can the proxy have this file if the URL is completely new? The two most obvious answers: the proxy has been configured that way, or it learned from the usage pattern to "optimize" access to this resource. Either way, the results are useless, because I want to measure my actual download speed from that specific location, not the speed from some internal BezeqInt server.
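
If you just want a rough single-connection number from a host you actually care about, curl can report the average download speed itself. A sketch; the URL is a placeholder, so point it at a large file on the server you want to measure, and prefer HTTPS or a non-80 port so the proxy stays out of the way:

```shell
# Download to /dev/null and let curl print the average transfer speed.
# https://example.com/testfile.bin is a placeholder, not a real test file.
curl -sS --max-time 10 -o /dev/null \
  -w 'average speed: %{speed_download} bytes/sec\n' \
  https://example.com/testfile.bin \
  || echo "download failed (replace the placeholder URL)"
```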

If you want to test your download speed from certain locations, I'd recommend myspeed.visualware.com. This software doesn't use the usual HTTP port, therefore doesn't go through the proxy, and provides results much closer to the truth.

January 1, 2012

Sneaky proxy, part 2

In my previous post I gave some details about BezeqInt's no-opt-out proxy. Being quite technical in nature, I feel it didn't explain what is going on for a casual internet user. I hope to fix that with this post, and also to share some of my new findings.

This proxy stands between you and the internet, scanning your web traffic passively and even saving some of it for later use, hence "caching proxy". It uses some advanced technology, namely DPI, to stay invisible while still collecting and reconstructing files from the traffic passing by. While DPI has recently got a bad reputation in Israel, I don't believe in inherently evil technology. Technology is neither good nor evil; it's the application that may or may not violate moral boundaries. A common analogy used to explain DPI is the post office example: all the information a postal worker needs to deliver your mail is on the envelope, the sending and receiving addresses. DPI is akin to opening the letter and inspecting its contents. Mind you, this analogy is not without defects; mail packages are routinely screened, etc. I myself have deployed and used DPI technology in my line of work. It helps to classify traffic prior to shaping, detect malicious intent, and gather statistics and forensic information. All this has been done for years by private enterprises in order to better control and protect their networks.

My research shows that all three major Israeli providers (BezeqInt, Netvision, 012 Smile) reroute ordinary web traffic through their DPI systems. I didn't contact Netvision, but both BezeqInt and 012 Smile refused to even acknowledge the existence of the system, claiming they had abandoned such practices years ago. This is not satisfactory and in my book constitutes lying; either that, or the support staff is just as ignorant as I was prior to my discovery. Even appealing to security, since the way it is done looks awfully similar to a man-in-the-middle attack, produced no results. So I stopped wasting my time and money on useless phone calls and emails. Instead, having some free time lately, I concentrated on researching a bit more. The result was my "Tale of a sneaky proxy" post. Since then I have made some progress, and some of the findings are quite alarming.

Like any other software, proxies are not exempt from bugs or misconfiguration. Moreover, the HTTP protocol wasn't built with any kind of transparent proxy in mind; that's why your browser has proxy settings. This is especially true for caching proxies. The Internet Engineering Task Force (IETF) even has an RFC document detailing some of the common problems and workarounds (RFC 3143). Given the clever implementation of the proxy at hand, I guess it avoids some of these problems, but it may introduce others. Most of my experiments were done using a BezeqInt ADSL line, so the following results and conclusions may or may not be pertinent to other ISPs.

During the exploration I had a feeling that the BezeqInt proxy was quite frivolous about what content it caches. It had no problem caching files that require authentication before downloading. Let me elaborate: you are given a link to a file that asks for a username and password before downloading. You provide your credentials and download the file. Now guess how many copies of the file were made during transmission. Two! One for you and one for the BezeqInt proxy cache. I don't think someone is trying to steal your data; my guess would be plain negligence in the proxy configuration. That doesn't comfort me either, though. I'd think that many BezeqInt customers would prefer to be aware of such intricacies.
But that's not the worst part of my findings. After looking at all the data, I decided to run a simple "cache poisoning" scenario against one of my domains. To my dismay, I succeeded! I was able to force the proxy to cache my version of the file from my other domain. Now imagine this coupled with malicious intent. One could use the BezeqInt caching proxy as a distribution network for viruses, trojans, you name it. You think you are downloading some .exe file from a trusted site? Nope, it's a hacker's modified version of the same file that he forced the proxy to cache, and you wouldn't know any better. I notified BezeqInt about my findings, and I hope it will be fixed soon.

After doing all this research, I find myself in a peculiar situation. I can't opt out of a system that "doesn't exist". Changing providers won't help either (maybe there are other, smaller ISPs that don't do this kind of traffic surveillance). I have no idea how many customers are affected, and the secrecy surrounding this system gives no opportunity to know exactly what is being done with all the surfing data that has been collected.

I still have questions that no one seems to have answers for. Should BezeqInt be allowed to forcibly reroute its customers through a system like this with no opt-out mechanism? Is there any other purpose besides caching? Should there be any oversight or regulation of this system? I don't know, and I'd be happy to hear from someone with insight.

December 22, 2011

Tale of a sneaky proxy

It has come to my attention that my ISP runs a caching proxy server. The ISP I'm talking about is Bezeq International, one of the big Israeli internet service providers. Proxying in itself is quite a common occurrence and a useful technology. What got my attention is how they do it.

Fig.1: Scapy TCP SYN trace
I was trying to find a good Ubuntu mirror when I felt there was a proxy. While a test ls-lR download maxed out my 10Mb line, updates were crawling at 20KB per second. The browser connects to the internet directly, so what gives? My ISP has set up an intercepting proxy. And I thought this practice had been abandoned years ago. I had to check.

A common intercepting proxy is visible in tcp traceroute output, but not this one (Fig.1 on the right). Instead, what we see is a detour of 3 hops for traffic on port 80. One of them answers using a private IP, which indicates that our SYN packet passed over an internal link. This is not unheard of (we are still inside the BezeqInt network, after all), but only WWW traffic takes this path. While this detour is symptomatic of special treatment, we don't see the proxy in action. Tcpdump to the rescue: capturing traffic on both ends of the connection should give us enough insight to understand what is going on.
In order to trigger the proxy, I put a random file on my web server abroad and downloaded it several times, until it maxed out my line. The server-side capture showed that only the first 3 transfers got there; the last 2 are very short and were reset (Fig.2). Which is weird, since I finished all the transfers successfully.

Fig.2: Server side TCP conversations.

The client-side conversations show the expected result: it looks like the transfers arrived just fine (Fig.3).

Fig.3: Client side TCP conversations.

Let's look at the 4th conversation. It was finished very abruptly on the server, but the client got the full file!
First, the client's perspective (Fig.4). The Wireshark filter shows only conversation 4 and only packets from the server. I added IP TTL, IP ID and Frame Length columns so you can see the difference better.

Fig.4: Conversation #4, client perspective.

As you can see, the packets marked white are very different from the packets marked green. In fact, the green packets are the legitimate ones from my server. The white ones, on the other hand, were sent by something else.

IP TTL
I used Linux on both ends of the conversation, and both of them use a TTL of 64. Our mysterious stranger, though, appears to use a TTL of 255. In my practice, a TTL of 255 is used mostly for ICMP packets, and 64 or 128 for TCP packets; it makes sense to allow service messages to travel longer distances than TCP. Also, 255 is the maximum value the IP TTL can take. This means that our perpetrator can't be further than 4 hops away! Which puts him somewhere in the detour our packets took during the traceroute (Fig.1). I only know one OS that uses TTLs like that: SunOS/Solaris.
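
The hop arithmetic can be checked mechanically. A sketch, assuming the white packets arrived with a TTL of 251, a hypothetical reading consistent with the 4-hop conclusion; the candidate initial TTLs are the common 64, 128 and 255:

```shell
# Guess the sender's distance from an observed TTL: the initial TTL is
# taken as the smallest common value not below the observed one.
observed=251
for initial in 64 128 255; do
  if [ "$observed" -le "$initial" ]; then
    echo "initial TTL $initial: sender is $((initial - observed)) hops away"
    break
  fi
done
```

For observed=251 this prints "initial TTL 255: sender is 4 hops away".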

IP ID
The IP ID is used for the purposes of fragmenting packets. Linux, AFAIK, always writes an ID into IP packets even if the "Don't Fragment" bit is set. Our perpetrator doesn't. Which is fine, I guess, since by the RFC, packets that don't fit into the "pipe" and have DF set shall be dropped, with an ICMP notification sent to the IP source.

Frame length
There is a difference of 52 bytes in frame length. That may indicate a tunnel those packets were squeezed through, probably a GRE tunnel.

At this point there is no doubt that this connection was intercepted and that the file I was downloading was fed from a server in close proximity. Still, let's look at the dump from my server.

Fig.5: Conversation #4, server side.

Those white packets look familiar: same initial TTL, same zero ID. It checks whether the connection is up and resets it.

Another thing to note is that with every conversation, after it has been set up, the packet path from server to client increases by 4 hops. In simple terms, the file you are downloading is being rerouted through the cache servers, as they want to save it without making a separate connection to the server.

Fig.6: TTL drop from 44 to 40.

After looking at all this, I had an idea: why not trace inside the connection using TCP ACKs?! As with most of my great ideas, someone had already done it. The technique is called "subliminal traceroute", and the first public implementation was 0trace. I used one by the name of intrace. This utility produces the output I was expecting from my first TCP trace. Here is one I ran on my client machine:
InTrace 1.5 -- R: Server/80 (80) L: Client/58990
Payload Size: 1 bytes, Seq: 0x6e1c3ebc, Ack: 0x63bd8df4
Status: Press ENTER                                                        

  #  [src addr]         [icmp src addr]    [pkt type]
 1.  [192.168.1.1    ]  [Server         ]  [ICMP_TIMXCEED]
 2.  [212.179.37.1   ]  [Server         ]  [ICMP_TIMXCEED]
 3.  [81.218.103.194 ]  [Server         ]  [ICMP_TIMXCEED]
 4.  [212.179.152.130]  [Server         ]  [ICMP_TIMXCEED]
 5.  [Server         ]  [  ---          ]  [TCP]

Busted.

In conclusion

What we have here is a very sneaky caching proxy. It saves files that pass through it, does so by eavesdropping on HTTP connections, and disrupts communication when it wants to serve a file directly. I didn't test the caching mechanism, but in theory it may disrupt caching-sensitive services. If you experience problems like that, try using another port, HTTPS for example.

BezeqInt support representatives refused to acknowledge forcing this proxy on me. All the BezeqInt lines I've checked go through this proxy. I think this eavesdropping on communications is a violation of privacy. It might even be illegal; I'm not sure, I'm not a lawyer. I know many people buy these lines. They might want to reconsider, knowing that the file they just downloaded might have been saved for later use by their ISP.

PS: By the way, BezeqInt is not the only ISP in town who does this.

December 4, 2010

Mike Galbraith's patch on Maverick, how-to

You have probably already heard of the now famous ~200-line patch for the Linux kernel by Mike Galbraith, continuing the Con Kolivas saga somewhat more successfully in terms of acceptance into the mainline kernel. The patch, albeit only around 200 lines of C code, shows a remarkable increase in desktop responsiveness. If you don't know what I'm talking about, check out the Phoronix article about the patch. They even have videos showing off the difference.
Today I'll show you how to bring this marvelous patch onto your own Ubuntu desktop. We are not going to compile anything. We are going to use the Ubuntu Natty kernel package that has the patch applied by the nice folks at Canonical.


1. Download kernel package 
Go to packages.ubuntu.com and search for linux-image. It's a virtual package providing a list of actual kernel packages. Choose the latest; at the time of this writing it's linux-image-2.6.37-7-generic. At the bottom of the page you'll find download links. Choose your architecture and a mirror, and download the package.


2. Installation
Double-click the downloaded package and choose Install.
sudo dpkg -i <package name> should work just fine too.


3. Reboot


That's it, folks! You should be running the new kernel now.
If you want to be sure, open a terminal and run uname -r. I got 2.6.37-7-generic, which is the kernel I downloaded and installed.
Also, check that we are indeed running the new feature:


$ sysctl kernel.sched_autogroup_enabled
kernel.sched_autogroup_enabled = 1
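
The sysctl above can also be set persistently across reboots via /etc/sysctl.conf. A minimal fragment; 1 enables scheduler autogrouping, 0 disables it:

```
# /etc/sysctl.conf
kernel.sched_autogroup_enabled = 1
```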


Enjoy low latency desktop :)