view README.txt @ 41:194ea1428156

from http://k0s.org/portfolio/directed-graph.txt
author Jeff Hammel <jhammel@mozilla.com>
date Thu, 11 Apr 2013 15:31:38 -0700
parents 00aee7b82657
children 1cc89badc5c6
line wrap: on
line source

WSGraph
===========

WSGI graph server


Directed Graphs
===============

Storage
-------

A directed graph may be expressed as JSON::

 {'nodes': {'node1': {'metadata1': value, ...},
            'node2': {'metadata1': value, ...}}.
  'edges': {'node1': {'node2': {'metadata': value}, ...},
            ...},
  'graphmetadata': value,
  ...}

In languages like python, which can hash (key, value) pairs, the edges
may be more conveniently hashed::

 edges[('foo', 'bar')] = {'metadata': value}

A directed graph may be stored in a RDB::

Table: *nodes*

+----+---------+---------+---+
|node|metadata1|metadata2|...|
+====+=========+=========+===+
|foo |value    |value    |   |
+----+---------+---------+---+
|bar |value    |value    |   |
+----+---------+---------+---+

Table: *edges*

+-----+-----+---------+---------+---+
|node1|node2|metadata1|metadata2|...|
+=====+=====+=========+=========+===+
|foo  |bar  |value    |value    |   |
+-----+-----+---------+---------+---+

Note that the nodes table has unique keys, but the edges table has
rows identified only by the ordered set of two keys: node1 and node2.

Previous formats use a node and value hash of a string: the nodes have
a given name.  While this is convenient for human identification, a
more efficient hashing mechanism may be achieved by an algorithm.
Since a edge is a unique hash of two nodes, the nodes may be numbered
with an appropriate algorithm giving a unique hash per edge.


Higher Dimensionality
---------------------

* faces
* ...


Links
-----

* http://www.graphdracula.net/
* http://thejit.org/

Ideas
-----

I work at automation and testing at Mozilla, but have been known to
don multiple hats, and, not surprisingly, I think a lot about how the
web works. So I just read Stuart's blog post and couldn't help but
feel that it plays somewhat in to my idea of mine.

Background....

The web is a directed graph (
http://en.wikipedia.org/wiki/Directed_graph ). The nodes of the
directed graphs are web pages and the edges of the directed graphs are
produced when you clicked on a link.  This gives a unique opportunity
to provide maps of the web.  There are a few components in this
visualization.

A site map.  This is a map that sits on a server.  When a visitor
clicks on a link, a new edge is added by reading HTTP REFERER.  A toy
example, no longer "live", is at http://k0s.org/map.svg . This gives
both the webmaster and visitors to see how the site is traveled: the
well worn routes as well as the dusty trails.  Note that the map
itself is an element on the map :) In addition, the map could be
embedded as a navigation aid on any page that desires it.  Obviously a
lot more could be done here than is shown in the toy example: nodes
could be collapsed into metanodes until expanded (that is
/pictures/mission/capp.jpg and /pictures/mission/cowboy.jpg might just
be /pictures/mission for the sake of easy navigation unless a user
clicked in for closer detail), etc.  The algorithm is really simple:
 - for each request (in middleware) read the path info and the HTTP
 referer
 - if this edge doesn't exist create it
 - if it does, add one to its count (and whatever other metadata you
 want....you have the whole request)
 - additionally, you need a handler to display this data
Some sites do this though I rarely see it presented this way and even
more rarely to the user.
In addition, you can make the system federated.  For instance, lets
say foo.org and bar.org both have maps (presumedly noted as <link
rel="" type=""> in their <HEAD>s).  If there is traffic from foo to
bar you can stitch these maps together at the points where they meet.
You have a federated map of the web!

The map of one's journey.  So the above is from the site's point of
view.  What about the user's point of view?  It would be easy to make
an addon that recorded your own map of how you traversed the web.  You
click on a link in any tab and, again, an edge is created (and
populated with whatever metadata you want). You can see how you use
the web as well as share your travels with friends.  This could also
link up with site maps when they are available.

Just some things I've been thinking of in the last two years.  I'm
happy to go into more detail or flush them out if there is interest.

https://groups.google.com/forum/?hl=en#!topic/mozilla-labs-pancake/TbRFpxvic6M
http://play.fxhome.mozillalabs.com:4322/lattice/oyiptong@mozilla.com/graph/vis.html



Graph exchange formats:

* http://en.wikipedia.org/wiki/Graph_Modelling_Language
* http://en.wikipedia.org/wiki/GraphML
* http://en.wikipedia.org/wiki/Trivial_Graph_Format
* http://www.gupro.de/GXL/ , http://en.wikipedia.org/wiki/GXL
* http://en.wikipedia.org/wiki/Dot_language (Graphviz)


Python packages related to graphs:

* graph-tool: http://en.wikipedia.org/wiki/Graph-tool ,
  https://pypi.python.org/pypi/graph-tool


External libraries and tools

* Graphviz
* tulip: http://tulip.labri.fr/TulipDrupal/

--

* see also http://k0s.org/links/?q=graph

Parallel Efforts
----------------

* http://k0s.org/hg/IntentMadeManifest/



----

Jeff Hammel

http://k0s.org/hg/WSGraph