Jim Lawless' Blog


PHP, Transparent GIF's, and Web Tracking

Originally published on: Thu, 30 Apr 2009 00:58:16 +0000

When navigating the web, one often encounters tracker applications. "Trackers" are simply web programs that are triggered by IMG tags in a web page. The SRC attribute for these tags usually refers to a web program rather than a static image. The web program then generates the binary data for a one-pixel transparent GIF or PNG image, which blends into the background of the web page.

There's really not a whole lot of effort involved in building one's own tracker. I've written a sample tracker in PHP and will suggest techniques to ensure proper usage in web pages.

tracker.php


<?php
# License: MIT / X11
# Copyright (c) 2009 by James K. Lawless
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation
# files (the "Software"), to deal in the Software without
# restriction, including without limitation the rights to use,
# copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following
# conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
# OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.

   # write out pertinent headers to let the browser know
   # that we're delivering a 43-byte GIF image
header("Content-type: image/gif");
header("Content-length: 43");
   # write one-pixel wide transparent GIF to the output
   # stream
$fp=fopen("php://output","wb");
fwrite($fp,"GIF89a\x01\x00\x01\x00\x80\x00\x00\xFF\xFF",15);
fwrite($fp,"\xFF\x00\x00\x00\x21\xF9\x04\x01\x00\x00\x00\x00",12);
fwrite($fp,"\x2C\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02",12);
fwrite($fp,"\x44\x01\x00\x3B",4);
fclose($fp);
   
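  # append the hit to a log file
  # you'll probably want to replace this with an INSERT to your
  # favorite database
  # (any of these fields may be missing from a request, so default
  # each one to an empty string instead of reading an undefined index)
$ref  = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$addr = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';
$host = isset($_SERVER['REMOTE_HOST']) ? $_SERVER['REMOTE_HOST'] : '';
$brow = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$page = isset($_GET['page']) ? $_GET['page'] : '';
$log=fopen("log.txt","a");
$t=date("r");
fwrite($log,"$t : page=$page $ref $addr $host $brow \r\n");
fclose($log);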
?>

First, I created a transparent GIF file one pixel wide using Microsoft Paintbrush and IrfanView. The transparent GIF image is forty-three bytes in size.

I then translated the values for this image into a series of PHP strings using the hexadecimal escape-sequence prefix \x to embed most of the values in the strings.

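My concern with having a byte with the value zero in a string was that PHP implementations are mostly constructed in C/C++, so I expected this byte to act as a string terminator if I tried to use the PHP echo command to output these strings. (PHP strings actually carry an explicit length and are binary-safe, so echo would have written the zero bytes as well; fwrite() with a length simply makes the byte count explicit.)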

I was unclear as to how I would write binary data back to the browser in PHP, but it turned out to be a painless process.

I used the fwrite() function which allowed me to specify the number of bytes I was writing out each time. The zero bytes were included in each as fwrite() treats the string like a fixed-size buffer when the length parameter is supplied.

In order to deliver the data to the browser, I opened a special PHP file php://output with the fopen() function. This special PHP file represents the buffered output that will be transmitted to the browser requesting the page.

After sending the GIF data using four calls to fwrite(), I close the filehandle used as output.
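The zero-byte concern is easy to check directly. The sketch below is not part of tracker.php; it joins the same four strings into one value (the variable names are my own) and compares the byte count reported by strlen() with what echo actually emits, capturing the output in a buffer rather than sending binary data to the screen.

```php
<?php
// $gif is a hypothetical variable name; the bytes are the four
// fwrite() strings from tracker.php joined together.
$gif = "GIF89a\x01\x00\x01\x00\x80\x00\x00\xFF\xFF"
     . "\xFF\x00\x00\x00\x21\xF9\x04\x01\x00\x00\x00\x00"
     . "\x2C\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02"
     . "\x44\x01\x00\x3B";

// strlen() reports the byte count, embedded zero bytes included.
$length = strlen($gif);

// Capture what echo would send instead of printing binary data.
ob_start();
echo $gif;
$echoed = ob_get_clean();

printf("string: %d bytes, echoed: %d bytes\n", $length, strlen($echoed));
```

Both counts should come out as 43, which matches the Content-length header the tracker sends.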

The next section of code obtains a few key fields from the request that might be useful for tracking purposes (the HTTP referring address, the remote host name, the remote host IP address, and the browser description, or user agent), along with a special field, page, that we will append as a parameter onto the end of our URL for the tracker PHP page.

In this example, I simply append this message to the file log.txt. You will probably want to change this code so that this data is logged in a database.
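As a rough idea of what the database version might look like, here is a minimal sketch using PDO with a prepared statement. The log_hit() function, the hits table, and its column names are all hypothetical, and the in-memory SQLite database (which assumes the pdo_sqlite extension is available) is only there to keep the example self-contained; a real tracker would connect to its own database.

```php
<?php
// Hypothetical helper: record one hit through PDO.
// The table and column names are illustrative only.
function log_hit(PDO $db, $page, $ref, $addr, $brow)
{
    $db->exec(
        'CREATE TABLE IF NOT EXISTS hits (
            hit_time TEXT, page TEXT, referer TEXT, addr TEXT, browser TEXT)'
    );
    // Placeholders keep request-supplied values out of the SQL text.
    $stmt = $db->prepare(
        'INSERT INTO hits (hit_time, page, referer, addr, browser)
         VALUES (?, ?, ?, ?, ?)'
    );
    $stmt->execute(array(date('c'), $page, $ref, $addr, $brow));
}

// Assumption: pdo_sqlite is installed. Swap the DSN for a real database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

log_hit($db, 'first', '', '127.0.0.1', 'Wget/1.10.2');
$count = (int) $db->query('SELECT COUNT(*) FROM hits')->fetchColumn();
```

Since the page value and the referrer come straight from the request, binding them as parameters is probably the safest habit here.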

So, let's assume that we have the PHP page tracker.php placed on a web server on the local machine. You'll probably want to create the log.txt file in the same folder and ensure that you've given the web system user ID appropriate permissions to write to the file in Windows or have chmod'ed the file in Unix/Linux so that it can be written to.

To test the PHP page to see if it worked, I used a Windows port of the Unix wget utility. (wget is a command-line utility for retrieving documents via the HTTP protocol.) I added the verbose option to the command-line so that I could see the detailed conversation with the web server.

The command-line I issued was:

wget -v http://localhost/tracker.php?page=first

The output from wget was as follows:

--19:22:53-- http://localhost/tracker.php?page=first
           => `tracker.php@page=first'
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43 [image/gif]

100%[====================================>] 43 --.--K/s

19:22:53 (218.48 KB/s) - `tracker.php@page=first' saved [43/43]

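The content of my log entry in log.txt read as follows:

Wed, 29 Apr 2009 19:22:53 -0500 : page=first 127.0.0.1 Wget/1.10.2

(The referrer and remote host name fields are empty in this entry, which is why only the IP address and user agent follow the page value.)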

Note that the parameter we appended to the end of the URL, page=first, was written to the log because of the output line:


fwrite($log,"$t : page=$page $ref $addr $host $brow \r\n");

I had prefixed the output of the $page variable with the string "page=" so it happens to look exactly the same as the way we specified it with both the key and value in the URL.

This page parameter should be used to identify the page you are tracking. If you have a tracker IMG tag on your main index HTML file, you might use the URL tracker.php?page=main-index or something similar.

To embed this link in a web page, simply add the tag:

<IMG SRC="http://yourserver.goes.here/tracker.php?page=somepage">
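If the pages themselves are generated by PHP, the tag could also be built by a small helper so that page names containing spaces or other special characters are encoded properly. This is only a sketch; tracker_img_tag() is a hypothetical name and not something the tracker requires.

```php
<?php
// Hypothetical helper: build the tracker IMG tag for a given page name.
function tracker_img_tag($trackerUrl, $page)
{
    // rawurlencode() makes the page name safe inside a query string.
    $url = $trackerUrl . '?page=' . rawurlencode($page);
    // htmlspecialchars() makes the finished URL safe inside the attribute.
    return '<IMG SRC="' . htmlspecialchars($url, ENT_QUOTES) . '">';
}

$tag = tracker_img_tag('http://yourserver.goes.here/tracker.php', 'main index');
echo $tag, "\n";
```

Here the space in "main index" is sent as %20, and PHP decodes it again before the tracker reads $_GET["page"].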

Try to visit your page with a browser and you should see a new entry in your log. Unfortunately, if we use this simple tag in the page, we are unlikely to log repeat visits in the immediate future because the browser is likely to cache the image.

While there are a number of recommendations for generating different kinds of caching headers and such, I've found that the most reliable way to ensure that the browser tries to re-acquire the image each time is to subtly change the URL. I do this by appending a timestamp in milliseconds as an additional parameter t that we never reference in the PHP script.

In order to append this value, we have to use some JavaScript to build the IMG tag for us. Here's what the script might look like:


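<script type="text/javascript">
   // new Date().getTime() is the current time in milliseconds; it makes
   // each generated URL unique so the browser cannot reuse a cached image
   document.write(
      '<IMG SRC="http://yourserver.goes.here/tracker.php?page=somepage&t='
      + new Date().getTime() + '">');
</script>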

This would generate an IMG tag that might look something like:


<IMG SRC="http://yourserver.goes.here/tracker.php?page=somepage&t=1241052086453">
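The caching headers mentioned earlier can be sent in addition to the timestamp parameter, as a second line of defense. The sketch below shows one commonly used combination; no_cache_headers() is a hypothetical helper, and how strictly browsers and proxies honor these headers varies, which is part of why the changing URL remains the more dependable approach.

```php
<?php
// Hypothetical helper returning response headers that ask browsers and
// proxies not to reuse a stored copy of the image.
function no_cache_headers()
{
    return array(
        'Cache-Control: no-store, no-cache, must-revalidate, max-age=0',
        'Pragma: no-cache',   // for older HTTP/1.0 clients
        'Expires: 0',         // an invalid date is treated as already expired
    );
}

// header() only works before any body output, so in tracker.php this loop
// would sit next to the Content-type and Content-length calls.
if (!headers_sent()) {
    foreach (no_cache_headers() as $line) {
        header($line);
    }
}
```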

The code I've provided is really intended to be a starting-point; you should modify it to suit your needs by adding/removing fields from the logging and such. In some cases, you might want to generate an e-mail when a particular value for the $page variable is detected.
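As one possible shape for the e-mail idea, the sketch below checks the page value against a watch list and hands a short message to a sender function. Everything here is an assumption for illustration: the notify_if_watched() name, the watch list, and the address. The sender defaults to PHP's mail() function, but the demonstration passes a stand-in that just records the subject, so the logic can be tried without a configured mail server.

```php
<?php
// Hypothetical helper: send a notice when $page is on a watch list.
// $send is any callable accepting (to, subject, message), as mail() does.
function notify_if_watched($page, array $watched, $to, $send = 'mail')
{
    // Strict comparison: only exact string matches trigger a message.
    if (!in_array($page, $watched, true)) {
        return false;
    }
    $subject = "Tracker hit: $page";
    $message = "Page '$page' was requested at " . date('r');
    return (bool) call_user_func($send, $to, $subject, $message);
}

// Stand-in sender for demonstration; it records subjects instead of mailing.
$subjects = array();
$recorder = function ($to, $subject, $message) use (&$subjects) {
    $subjects[] = $subject;
    return true;
};

$hit  = notify_if_watched('signup', array('signup'), 'you@example.com', $recorder);
$miss = notify_if_watched('main-index', array('signup'), 'you@example.com', $recorder);
```

In tracker.php the call would go after the log entry is written, with the fourth argument left off so that mail() is used.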

Unless otherwise noted, all code and text entries are Copyright ©2009 by James K. Lawless
