PDF history and something special from Adobe

Part One: PDF history 

PDF is a formal open standard, ISO 32000. It was invented by Adobe Systems, which released the first version in 1993.

PDF = Portable Document Format

PDF history by Adobe

History of the PDF by Adobe Systems

The image links to a pleasant interactive timeline of Adobe Systems and its role in the development of the PDF. The chronology is in Flash, and thankfully free of any video or audio. Read more about Adobe Systems' role in the history of PDF file development.

PDF files are more versatile than I realized, and

  • are viewable and printable on Windows®, Mac OS, and mobile platforms e.g. Android™
  • can be digitally signed
  • preserve source file information — text, drawings, video, 3D, maps, full-color graphics, photos — regardless of the application used to create them

Additional PDF-related formats exist, including the PDF/A and PDF/E subsets and U3D, a format for 3D content that can be embedded in a PDF. All are supported by Adobe software.

Published on September 5, 2011 at 7:30 pm

Power law relationship in modern demographics

Cognition seems to be the driver behind a power law relationship, which would be odd indeed. It implies a fixed way of thinking about geography and places that can be modeled statistically. Human thought processes aren’t generally amenable to quantitative models.

Is this something new?

curious relationship

Toponyms

Giving a name to a place is an important act. It says a place has meaning, that it should be remembered. For thousands of years, the way we kept track of place names—or toponyms—was by using our memory. Today, we’re not nearly so limited, and the number of toponyms seems to have exploded. Yet oddly enough, the number of places we name in a given area follows a trend uncannily similar to one seen in hunter-gatherer societies.…

via Per Square Mile
Next steps?

  1. Confirm if Eugene Hunn’s 1994 findings were reproduced with current data
  2. Check whether the USPS zip code information used was correct
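For the first step, a power law of the form y = c * x^k shows up as a straight line on log-log axes, so the exponent can be estimated with an ordinary least-squares fit on the logarithms. The sketch below is only an illustration of that technique. The variable names (`areas`, `densities`) and the numbers are synthetic placeholders I made up, not Hunn's data and not USPS figures.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = c * x**k on log-log axes. Returns (c, k)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    k = sxy / sxx                       # slope on log-log axes = exponent
    c = math.exp(mean_y - k * mean_x)   # intercept on log-log axes = log(c)
    return c, k

# Synthetic placeholder data generated from a known power law.
# These are NOT real toponym or zip code counts.
areas = [1.0, 10.0, 100.0, 1000.0, 10000.0]
densities = [50.0 * a ** -0.75 for a in areas]

c, k = fit_power_law(areas, densities)
print(f"fitted exponent k = {k:.3f}, prefactor c = {c:.2f}")
```

With real data, the points would scatter around the line, and the quality of the fit (not just the exponent) would be the thing to check before calling it a power law.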

Basic data visualization

Simple data viz

Internet users by country in 2010

This is the first of five graphics in a series, State of the Internet 2010. All are hand-made graphics by Jose Duarte. He is exploring new and simple ways to represent information. With his handmade visualization tool-kit, he provides the technology to rapidly create any kind of graphics including

abstract maps and diagrams, area graphs and charts, arrow diagrams, bar graphs, Venn diagrams, time line charts, bubble graphs, circle diagrams, proportional charts, organization charts, and really, whatever you want.

Do you want your own kit? Follow the link embedded above, and follow the instructions. It can be yours, free of charge, no-strings-attached. Just send an email to Jose Duarte as instructed in the text accompanying the “handmade visualization tool-kit” link.

Published on August 8, 2011 at 9:42 am

Twitter Influence System

Twitter flow chart

Twitter influence is subtle and difficult to capture

Twitter influence ranges beyond measuring followers, @ replies and re-tweets. It isn’t trivial to calculate the true reach of an individual’s Twitter updates. Such are the challenges encountered in quantifying influence (perhaps even value) of Twitter users’ activity.

Percentage of Tweets Read

Actual percentage of Twitter content read

This chart shows the percentage of tweets read in relation to the number of people followed. As could be expected, the more people you follow, the smaller the percentage of tweets you actually read.

Both images, Twitter Influence EcoSystem and Percentage of Tweets Read, are original work by John V Lane, via Flickr, and reproduced here under Creative Commons License/by-sa/2.0.
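The inverse relationship in the Percentage of Tweets Read chart can be illustrated with a toy model. This is my own simplification, not the method behind the chart: it assumes every followed account posts the same number of tweets per day and that a reader has a fixed daily reading capacity. The function name and both default values are hypothetical.

```python
def percent_read(following, tweets_per_account=5.0, daily_capacity=200.0):
    """Toy model: share of incoming tweets a reader gets through each day.

    Assumes (hypothetically) a uniform posting rate per followed account
    and a fixed number of tweets the reader can manage per day.
    """
    if following <= 0:
        raise ValueError("following must be a positive number")
    incoming = following * tweets_per_account
    return 100.0 * min(1.0, daily_capacity / incoming)

for following in (10, 40, 100, 400, 1000):
    print(f"following {following:>4}: {percent_read(following):5.1f}% read")
```

Under these assumptions the percentage stays at 100 until the incoming volume exceeds the reader's capacity, then falls off in proportion to 1 / following.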

Published on July 8, 2011 at 8:05 pm

Zanran is a new data search engine

Something new and different in search has appeared.

Zanran is an internet start-up company that hails from somewhere other than Mountain View or Sunnyvale, California. Nor is it in “Silicon Valley East”, the new incubator of technology ventures otherwise known as the Borough of Manhattan. Zanran is farther than farthest Fishkill, across a span greater than even the Tappan Zee can bridge. Zanran is a U.K.-domiciled company in Islington, London.

Not a Google Universal Search 2.0 competitor

Zanran seems to be more of a database searching tool. It would probably be best classified as a specialized search engine.

Zanran Data Search

Zanran Search Beta screen shot

Zanran’s search method is described as patented but based on open-source programs. The actual patent, A Method and System of Indexing Historical Data, which I only glanced at, should help clarify the details. Zanran distinguishes itself because it is particularly well-suited to web searches for information that has embedded numerical or graphical data:

Zanran helps you to find ‘semi-structured’ data on the web… numerical data e.g. a graph in a PDF report, or a table in an Excel spreadsheet, or a bar chart shown as an image in an HTML page. This huge amount of information can be difficult to find using conventional search engines, which are focused primarily on finding text… Put more simply: Zanran is Google for data.

Zanran is not a search engine with obvious uses in text or sentiment analysis. The beta website has a long page of examples demonstrating the speed (fast), breadth (across a very diverse assortment of scientific and analytic use cases) and quality of results.

Arthur Weiss, a competitive analyst and former long-time employee of Dun & Bradstreet UK, did a very thorough review of Zanran Search (April 2011):

I’ve been playing with a new data search engine called Zanran… The site is in an early beta. Nevertheless my initial tests brought up material that would only have been found using an advanced search on Google – if you were lucky. As such, Zanran promises to be a great addition for advanced data searching.

Zanran enters the marketplace

Zanran appears to have retained Mallard Digital Marketing. Mallard Digital’s hallmarks are “Authenticity, Transparency and Engagement”. Mallard features an attractive duck in the company logo, and in this rather engaging 15-second video. I base my conjecture about Mallard and Zanran upon three pieces of evidence:

  1. Mallard’s recent announcement about the acquisition of a search engine as a new client on 29 March 2011
  2. The fact that Mallard likes Zanran and Zanran likes Mallard on the Facebook pages of each company
  3. The Zanran company dog enjoyed playing with Mallard’s Labrador retriever in March 2011 (also via Facebook)

Analogy and Digression: SHODAN

As a very general analogy, Zanran functionality reminds me of Google Code Search or SHODAN computer search. SHODAN is a search engine that can be used to:

find specific computers (routers, servers, etc.) … [it is] a search engine of banners. Google and Bing are great for finding websites. But what if you’re interested in finding computers running a certain piece of software (such as Apache)?  Maybe a new vulnerability came out and you want to see how many hosts it could infect?

Here’s a screen shot of the main query page:

SHODAN specialized search

SHODAN screen shot

I am impressed to no end with SHODAN. It is quite clever, and remains very low profile, much like my blog.

UPDATE

I drafted this on 12 May 2011 but failed to actually post it due to my insatiable need to fuss and play with WordPress functionality. In the interim, others (most notably Search Engine Journal) have also found the subject of this post, the Zanran data search engine. I mention this not as self-promotion, but rather to emphasize that Zanran may be of greater significance than my casual tone indicates.

Published on June 21, 2011 at 11:20 am

Risk perception and reality

This is an excerpt, selected by Moi, from the article Risk perception, a recent post that appeared on the Soapbox Science Blog, Nature Publishing Group.

Symbol of radiation hazard

Universal symbol of radiation and fear. Image via Wikipedia

Sometimes, no matter how right our perceptions feel, we get risk wrong. We worry about some things more than the evidence warrants (vaccines, nuclear radiation, genetically modified food), and less about some threats than the evidence warns (climate change, obesity, using mobile phones when we drive). That produces a Perception Gap, the gap between our fears and the facts.

The Perception Gap produces dangerous personal choices that hurt us and those around us (declining vaccination rates are fueling the resurgence of nearly eradicated diseases). It causes the harm to health of chronic stress (for those who worry more than necessary). And it produces social policies that protect us more from what we’re afraid of than from what in fact threatens us the most (we spend more to protect ourselves from terrorism than heart disease)… which in effect raises our overall risk.

We do have to fear fear itself…too much or too little. So we need to understand how our subjective risk perception works, in order to recognize and avoid its pitfalls.

Here was the take-away for me: Societal risk management has to recognize the risk of risk misperception, the risk that arises when our fears don’t match the evidence. This is truly the risk of The Perception Gap. It has always been relevant, and becomes so once again in light of the recent E. coli outbreak in northern Europe. The Guardian UK used that as a starting point for a well-written and up-to-date article about the hazards of risk misperception and the consequences of irrational behavior.

Kahneman and Tversky did extensive research on this topic. I am not concerned whether articles like the one referenced above are derivative, in the sense of revisiting past work. Possibly it is an application in the context of current events. Or it may be entirely original new work. My concern is solely that there is an awareness of the reality, and that it be acted upon.

Protected: You May Need This One Day

Published on June 13, 2011 at 8:21 am

Reduce desktop clutter

Published on May 17, 2011 at 12:02 am

Fault testing in the cloud

The impact of an Amazon Web Services* Elastic Compute Cloud (EC2) outage is being determined at this very moment. It will be interesting to do some post-situation analysis, and see what the effect was on global web traffic. That is not possible at this time, as EC2 remains off-line.

Many sites are unaffected of course.

Happily, a single cloud provider has not become indispensable for the internet. This should reinforce the viewpoint that alternative provider services, at least two or three, are always to be encouraged. The BBC provides a fine summary of the situation.

*This is the product more commonly known as AWS EC2.

Amazon fault takes down websites

21 April 2011 Scores of well-known websites have been unavailable for large parts of Thursday because of problems with Amazon’s web hosting service. Foursquare, Reddit and Quora were among the sites taken offline by the glitch. No reason has so far been given for the outage.

Quora website
Amazon’s cloud service last hit the headlines when it decided to stop hosting a mirrored version of the Wikileaks website. However, at this stage, there is nothing to suggest that the most recent outage was related to the Wikileaks controversy.
Read more at www.bbc.co.uk
Published on April 21, 2011 at 9:55 am

Radiation levels in Japan and the U.S.A.

UPDATE 13 April 2011: All links work in Part One. Added a Part Two for U.S. and European radiation levels.

Part One: Radiation levels in Japan

The source for this chart is Ryugo Hayano, Ph.D. Professor Hayano is the Physics Department Chair at The University of Tokyo. Click on the image to view a larger version, with higher resolution. It links directly to the Professor’s user page on image-sharing site Plixi. You’ll find many other charts and graphs there. Some charts are localized at a prefecture level.

Graph of Radiation

Graph of Radiation levels in Japan on 10 April 2011

I offer my thanks to @hayano and Daniel Garcia. Daniel R. Garcia Ph.D. is a nuclear scientist from France, doing a post-doc at TEPCO, in Fukushima. He was there prior to the earthquake and tsunami. Daniel frequently sends updates via Twitter as @daniel_garcia_r. He works at the reactor site every day, takes photos, and makes them available via Twitter.

Fukushima nuclear plant

Control board of Fukushima 1 Nuclear Power Plant when all was well

Both Daniel and Professor Hayano are reliable: they never confuse Becquerel with Sievert with Roentgen, and they know radio-isotopes and their half-lives better than nearly anyone. Daniel had to assist the press a few weeks ago when there was confusion between Cesium 137, Iodine 137, Iodine 131 and Uranium 137.
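Why the distinctions matter: the becquerel measures activity (decays per second), the sievert measures biological dose, and the roentgen measures exposure, so the three are not interchangeable. Half-lives matter just as much. As a rough sketch, the standard decay formula, fraction remaining = 0.5^(elapsed / half-life), shows how differently Iodine 131 and Cesium 137 behave over time. The half-life values below are approximate, and the function and constant names are my own.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of a radioactive sample left after `elapsed` time.

    `elapsed` and `half_life` must be in the same units.
    """
    return 0.5 ** (elapsed / half_life)

# Approximate half-lives, expressed in days.
IODINE_131_HALF_LIFE_DAYS = 8.02
CESIUM_137_HALF_LIFE_DAYS = 30.1 * 365.25  # roughly 30 years

for days in (8, 30, 365):
    i131 = fraction_remaining(days, IODINE_131_HALF_LIFE_DAYS)
    cs137 = fraction_remaining(days, CESIUM_137_HALF_LIFE_DAYS)
    print(f"after {days:>3} days: I-131 {i131:.4%} left, Cs-137 {cs137:.2%} left")
```

Iodine 131 is largely gone within a few months, while Cesium 137 is barely diminished after a year, which is why mixing up the two isotopes changes the whole story.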

Part Two: Other locales, other radiation levels

The Radiation Network is an excellent resource for radiation information in the U.S.A. and other parts of the world. It is a network of civilian volunteers using a protocol to report radiation readings, 24 hours a day, 7 days a week. Sensor stations are located throughout the contiguous 48 states, Hawaii, Alaska and Norway. There was one in Northern Japan. Sadly, it went off-line last month.

The Radiation Network is non-profit, all-volunteer and headquartered in Arizona. Tim is the public face of the Radiation Network. Using software developed for this purpose, Tim collects and aggregates the real-time data from the sensor stations, then updates the map online with the readings at one-minute intervals. The Radiation Network went online nearly a decade ago. Thus they offer very reliable baseline measures for comparison and detection of any incident. Their criteria for elevated radiation include

  • A rule-out protocol for false positives, e.g. spikes due to malfunctioning sensors,
  • The level of radiation that is significant: higher than the threshold AND sustained, and how long “sustained” is,
  • Exogenous causes such as geography. Readings in Colorado are always higher due to the higher elevation.
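The three criteria above can be sketched in a few lines. This is only my illustration of the idea, not the Radiation Network's actual software: the function name, the multiplier of three times the local baseline, the three-reading minimum, and all of the sample readings are hypothetical.

```python
def is_elevated(readings, baseline, factor=3.0, min_run=3):
    """Flag an alert only when readings stay above a local threshold.

    readings: one value per minute from a single station
    baseline: that station's normal level (handles geography)
    factor:   hypothetical multiple of baseline that counts as significant
    min_run:  hypothetical number of consecutive readings for "sustained"
    """
    threshold = baseline * factor
    run = 0
    for value in readings:
        run = run + 1 if value > threshold else 0  # a dip resets the run
        if run >= min_run:
            return True
    return False

# A single spike (e.g. a malfunctioning sensor) is ruled out.
print(is_elevated([20, 22, 400, 21, 19], baseline=20))   # False

# Above threshold AND sustained for three readings.
print(is_elevated([20, 70, 75, 80, 21], baseline=20))    # True

# A high-elevation station judged against its own higher baseline.
print(is_elevated([65, 70, 68, 66], baseline=60))        # False
```

Comparing each station against its own baseline is what keeps a place like Colorado from raising a permanent false alarm. The same readings judged against a baseline of 20 would trip the alert.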

The site is basic but functional. There are The Maps, and The Message. The Message is a running log of updates.

In addition to the embedded links above, you can read a little more about the Radiation Network in this little piece I wrote on Amplify on April 7.

Published on April 10, 2011 at 8:59 am