Jim Lawless' Blog


PHP, Transparent GIF's, and Web Tracking

Originally published on: Thu, 30 Apr 2009 00:58:16 +0000

When navigating the web, one often encounters tracker applications. "Trackers" are simply web programs that are triggered by IMG tags in a web page. The SRC attribute of such a tag refers to a web program rather than a static image. The program generates the binary data for a transparent one-pixel GIF or PNG image, which blends into the background of the page, and records the request as a side effect.

There's really not a whole lot of effort involved in building one's own tracker. I've written a sample tracker in PHP and will suggest techniques to ensure proper usage in web pages.

tracker.php


<?php
# License: MIT / X11
# Copyright (c) 2009 by James K. Lawless
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation
# files (the "Software"), to deal in the Software without
# restriction, including without limitation the rights to use,
# copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following
# conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
# OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.

   # write out pertinent headers to let the browser know
   # that we're delivering a 43-byte GIF image
header("Content-type: image/gif");
header("Content-length: 43");
   # write one-pixel wide transparent GIF to the output
   # stream
$fp=fopen("php://output","wb");
fwrite($fp,"GIF89a\x01\x00\x01\x00\x80\x00\x00\xFF\xFF",15);
fwrite($fp,"\xFF\x00\x00\x00\x21\xF9\x04\x01\x00\x00\x00\x00",12);
fwrite($fp,"\x2C\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02",12);
fwrite($fp,"\x44\x01\x00\x3B",4);
fclose($fp);

   # collect the request fields; some of them (the referer and
   # the remote host name in particular) are often not set, so
   # fall back to an empty string instead of raising a notice
   # (the ?? operator needs PHP 7 or later)
$ref = $_SERVER['HTTP_REFERER'] ?? '';
$addr = $_SERVER['REMOTE_ADDR'] ?? '';
$host = $_SERVER['REMOTE_HOST'] ?? '';
$brow = $_SERVER['HTTP_USER_AGENT'] ?? '';
$page = $_GET["page"] ?? '';
   # append the hit to a log file
   # you'll probably want to replace this with an INSERT to your
   # favorite database
$log=fopen("log.txt","a");
$t=date("r");
fwrite($log,"$t : page=$page $ref $addr $host $brow \r\n");
fclose($log);
?>

First, I created a transparent GIF file one pixel wide using Microsoft Paintbrush and IrfanView. The transparent GIF image is forty-three bytes in size.

I then translated the values for this image into a series of PHP strings using the hexadecimal escape-sequence prefix \x to embed most of the values in the strings.
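As a quick sanity check on that translation, here is a minimal sketch that joins the four strings from tracker.php and reads the GIF header back out of them. The variable names are mine, chosen for illustration only.

```php
<?php
// The 43 bytes from tracker.php, joined into one string.
$gif = "GIF89a\x01\x00\x01\x00\x80\x00\x00\xFF\xFF"
     . "\xFF\x00\x00\x00\x21\xF9\x04\x01\x00\x00\x00\x00"
     . "\x2C\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02"
     . "\x44\x01\x00\x3B";

// Bytes 0-5 are the signature; bytes 6-9 hold the width and height
// as little-endian 16-bit integers ("v" in unpack's format string).
$signature = substr($gif, 0, 6);
$size = unpack('vwidth/vheight', substr($gif, 6, 4));

// prints "GIF89a 1x1, 43 bytes"
printf("%s %dx%d, %d bytes\n",
    $signature, $size['width'], $size['height'], strlen($gif));
```

The byte count matches the Content-length header that the script sends.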

My initial concern with having bytes of value zero in these strings was that PHP implementations are mostly written in C/C++, where such a byte acts as a string terminator, so I was wary of using the PHP echo command to output them. (PHP strings actually store their length explicitly and are binary-safe, so echo would pass the zero bytes through as well.)

I was unclear as to how I would write binary data back to the browser in PHP, but it turned out to be a painless process.

I used the fwrite() function, which allowed me to specify the number of bytes I was writing out each time. When the length parameter is supplied, fwrite() writes at most that many bytes of the string, zero bytes included.

In order to deliver the data to the browser, I opened a special PHP file php://output with the fopen() function. This special PHP file represents the buffered output that will be transmitted to the browser requesting the page.
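To see what actually reaches the output, here is a small sketch that captures it with PHP's output buffering instead of sending it to a browser. It writes one chunk of the GIF twice, once with echo and once with fwrite() on php://output. The sample chunk and variable names are only for illustration.

```php
<?php
$bytes = "\x21\xF9\x04\x01\x00\x00\x00\x00";   // 8 bytes, four of them zero

ob_start();                          // capture instead of sending to a browser
echo $bytes;                         // plain echo
$out = fopen('php://output', 'wb');
fwrite($out, $bytes, 8);             // fwrite() with an explicit length
fclose($out);
$captured = ob_get_clean();          // stop capturing and fetch the bytes

// Both writes land in the same buffer with their zero bytes intact.
echo strlen($captured), " ", bin2hex($captured), "\n";
```

Under these assumptions the buffer holds 16 bytes, so either call delivers the zero bytes unchanged.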

After sending the GIF data using four calls to fwrite(), I closed the file handle used for output.

The next section of code obtains a few key fields from the request that might be useful for tracking purposes: the HTTP referring address, the remote host IP address, the remote host name, the browser description (user agent), and a special field that we will append as a parameter onto the end of our URL for the tracker PHP page.
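Since all of these values come from the visitor, it may be worth cleaning them before they are logged. The following is a rough sketch only: tracker_field() is a hypothetical helper, not part of tracker.php, and the sample arrays stand in for $_SERVER and $_GET.

```php
<?php
// Hypothetical helper: read a value from an array such as $_SERVER or
// $_GET, fall back to '' when the key is missing or not a string, and
// strip control characters so that a crafted value cannot add extra
// lines to the log.
function tracker_field(array $source, string $key): string
{
    $value = $source[$key] ?? '';
    if (!is_string($value)) {
        return '';
    }
    return preg_replace('/[\x00-\x1F\x7F]/', '', $value);
}

// Simulated request data; a live script would pass $_SERVER and $_GET.
$server = ['REMOTE_ADDR' => '127.0.0.1', 'HTTP_USER_AGENT' => 'Wget/1.10.2'];
$get    = ['page' => "first\r\nfake entry"];

// prints "firstfake entry||127.0.0.1"
echo tracker_field($get, 'page'), "|",
     tracker_field($server, 'HTTP_REFERER'), "|",
     tracker_field($server, 'REMOTE_ADDR'), "\n";
```

The missing referer comes back as an empty string, and the embedded line break in the page value is removed.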

In this example, I simply append this message to the file log.txt. You will probably want to change this code so that this data is logged in a database.
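For the database variant, one possible shape is sketched below using PDO with an in-memory SQLite database. This assumes the pdo_sqlite extension is installed; the table and column names are invented for the example, and the inserted values are the ones from my test run rather than live request data.

```php
<?php
// Assumption: the pdo_sqlite extension is available. ':memory:' keeps
// the sketch self-contained; a real tracker would point at a file or
// a database server instead.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE hits (
    hit_time TEXT, page TEXT, referer TEXT,
    addr TEXT, host TEXT, agent TEXT)');

// A prepared statement keeps visitor-supplied values out of the SQL text.
$insert = $db->prepare(
    'INSERT INTO hits (hit_time, page, referer, addr, host, agent)
     VALUES (?, ?, ?, ?, ?, ?)');
$insert->execute([date('r'), 'first', '', '127.0.0.1', '', 'Wget/1.10.2']);

$count = (int) $db->query('SELECT COUNT(*) FROM hits')->fetchColumn();
echo $count, "\n";
```

In tracker.php the execute() call would receive the same variables that the fwrite() line uses.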

So, let's assume that we have the PHP page tracker.php placed on a web server on the local machine. You'll probably want to create the log.txt file in the same folder and ensure that you've given the web system user ID appropriate permissions to write to the file in Windows or have chmod'ed the file in Unix/Linux so that it can be written to.

To test the PHP page to see if it worked, I used a Windows port of the Unix wget utility. ( wget is a command-line utility for retrieving documents via the HTTP protocol. ) I added the verbose option to the command-line so that I could see detailed conversation with the web server.

The command-line I issued was:

wget -v http://localhost/tracker.php?page=first

The output from wget was as follows:

--19:22:53-- http://localhost/tracker.php?page=first
=> `tracker.php@page=first'
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43 [image/gif]

100%[====================================>] 43 --.--K/s

19:22:53 (218.48 KB/s) - `tracker.php@page=first' saved [43/43]

The content of my log entry in log.txt read as follows:

Wed, 29 Apr 2009 19:22:53 -0500 : page=first 127.0.0.1 Wget/1.10.2

Note that the parameter we appended to the end of the URL, page=first, was written to the log because of the output line:


fwrite($log,"$t : page=$page $ref $addr $host $brow \r\n");

I had prefixed the output of the $page variable with the string "page=" so it happens to look exactly the same as the way we specified it with both the key and value in the URL.

This page parameter should be used to identify the page you are tracking. If you have a tracker IMG tag on your main index HTML file, you might use the URL tracker.php?page=main-index or something similar.

To embed this link in a web page, simply add the tag:

<IMG SRC="http://yourserver.goes.here/tracker.php?page=somepage" >

Try to visit your page with a browser and you should see a new entry in your log. Unfortunately, if we use this simple tag, subsequent visits from the same browser are unlikely to be logged because the browser is likely to cache the image and skip the request.
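One common response to this is to send headers that ask the browser not to keep a copy. A minimal sketch follows; no_cache_headers() is a hypothetical helper name, and how strictly a given browser or proxy honors these headers is something I would not rely on.

```php
<?php
// Hypothetical helper returning header lines that ask browsers and
// proxies not to reuse a stored copy of the image.
function no_cache_headers(): array
{
    return [
        'Cache-Control: no-store, no-cache, must-revalidate, max-age=0',
        'Pragma: no-cache',
        'Expires: Thu, 01 Jan 1970 00:00:00 GMT',
    ];
}

// In tracker.php these would be sent before the GIF bytes are written.
if (!headers_sent()) {
    foreach (no_cache_headers() as $line) {
        header($line);
    }
}
```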

While there are a number of recommendations for generating different kinds of caching headers and such, I've found that the most reliable way to ensure that the browser tries to re-acquire the image each time is to subtly change the URL. I do this by appending a timestamp in milliseconds as an additional parameter t that we never reference in the PHP script.

In order to append this value, we have to use some JavaScript to build the IMG tag for us. Here's what the script might look like:


<script type="text/javascript">
   document.write(
      '<IMG SRC="http://yourserver.goes.here/tracker.php?page=somepage&t='
      + new Date().getTime() + '">');
</script>

This would generate an IMG tag that might look something like:


<IMG SRC="http://yourserver.goes.here/tracker.php?page=somepage&t=1241052086453">
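If the page that carries the tag is itself generated by PHP, and is not cached, the same tag could be built on the server instead. The sketch below is only one way to do it: tracker_img_tag() is a hypothetical helper, the optional third argument exists just to make the example repeatable, and a 64-bit PHP build is assumed so that the millisecond value fits in an integer.

```php
<?php
// Hypothetical helper: build the tracker IMG tag on the server, adding
// a millisecond timestamp as the unused "t" parameter.
function tracker_img_tag(string $baseUrl, string $page, ?int $millis = null): string
{
    $millis = $millis ?? (int) round(microtime(true) * 1000);
    $url = $baseUrl . '?page=' . rawurlencode($page) . '&t=' . $millis;
    // htmlspecialchars() turns the & into &amp; inside the attribute.
    return '<IMG SRC="' . htmlspecialchars($url, ENT_QUOTES) . '">';
}

echo tracker_img_tag('http://yourserver.goes.here/tracker.php',
                     'somepage', 1241052086453), "\n";
```

The browser decodes &amp; back to & before requesting the image, so tracker.php sees the same parameters as with the JavaScript version.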

The code I've provided is really intended to be a starting point; you should modify it to suit your needs by adding or removing fields from the logging and such. In some cases, you might want to generate an e-mail when a particular value for the $page variable is detected.
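As a sketch of the e-mail idea, the check might look like the following. should_alert(), the list of watched pages, and the address in the comment are all made up for the example; the mail() call is left commented out because it depends on the server's mail configuration.

```php
<?php
// Hypothetical helper: decide whether a hit deserves an e-mail alert.
function should_alert(string $page, array $watchedPages): bool
{
    return in_array($page, $watchedPages, true);
}

$watched = ['pricing', 'download'];   // example values only
$page = 'download';                   // would come from $_GET["page"]

if (should_alert($page, $watched)) {
    // mail('you@example.com', "Tracker hit: $page", "Page $page was viewed.");
    echo "alert for $page\n";
}
```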

Unless otherwise noted, all code and text entries are Copyright ©2009 by James K. Lawless



Views expressed in this blog are those of the author and do not necessarily reflect those of the author's employer. Views expressed in the comments are those of the responding individual.
