Getting started with Gephi

For this post, I’m going to provide step-by-step instructions for those of you interested in creating network graphs using Gephi.  Other open-source software is certainly available for visualizing social network and textual data, such as Pajek (whose website could use a serious design update), but at the time of this post, Gephi 0.8.2-beta has some significant advantages.  Software such as Pajek allows you to save your graph image as a .bmp, .png, or .svg, but Gephi also allows you to save your graph image as a .pdf.

Additionally, Gephi’s most significant advantage over the competition comes from the inclusion of the Sigma.js plugin, which uses the HTML canvas element to display static graphs like those generated in Gephi.  This is a massive leap forward for sharing graphs generated in Gephi, as now they can be uploaded directly to your server/website using an FTP file manager such as FileZilla.  Interacting with a graph, rather than viewing it as a static image, used to require downloading the specific, proprietary program file containing the graph from its creator, then downloading the specific software needed to open that file.  However, with the Sigma.js plugin, interactive graphs can be displayed and shared instantly via a simple web address.

To begin the process of creating and sharing your own network graph using Gephi, I’ll break the process into a series of simple steps.  I think these instructions will be useful to those of you starting out, since a step-by-step “Gephi for Dummies” manual doesn’t exist at the time of this post, and it is something I wish I’d had when I first started working with the software.  Gephi has a “Quick Start” guide here which is worth a look, but it leaves much to be desired as a basic guide for a novice user.  The following instructions owe much to the wisdom and experience of Jason Heppler and Rebecca Wingo, graduate school colleagues who provided a lot of assistance during my trial-and-error process of figuring out Gephi.

1. Download and install Gephi from their website.

2. To download and install the Sigma.js plugin, open Gephi.  Click on the “Tools” tab and click “Plugins” from the drop-down menu.

3. Click on the “Available Plugins” tab and scroll down nearly to the bottom of the list to find the “Sigma Exporter” plugin.

4. Click the check box next to the “Sigma Exporter” plugin, then click the “Install” button at the bottom left corner of the window.  (See screenshot below.)

sigma screenshot

5. Once the plugin is downloaded and installed, close and re-open Gephi to complete the plugin installation.

6. Now it’s time to format your data for importation into Gephi.  Using Microsoft Excel, create a two-column data set.  Label the first column SOURCE and the second column TARGET.  (See example screenshot below.)

excel screenshot

7. Each row in the spreadsheet describes one edge (or connection): the SOURCE column names the node where the edge starts and the TARGET column names the node where it ends.  The nodes in your graph are the distinct names that appear in either column, and the number of edges equals the number of rows.  Repeat this source-target pattern for each connection between nodes you wish to visualize.

8. Once you’ve entered (or, hopefully, imported) your data and saved it as a .csv file, you’re ready to import the file into Gephi.

9. Open Gephi, click the “File” tab, then click “Open” from the drop-down menu.  Browse for your .csv file, and click the open button at the bottom of the window.

10. This will open the “Import Report” window.  Make sure the “Create Missing Nodes” box is checked, then hit OK.  (See screenshot below.)

import screenshot

11. To label your nodes in the graph, click on the “Data Laboratory” tab.

12. At the bottom of the screen, click the “Copy data to other column” button, then select “ID” from the drop-down menu.  (See screenshot below.)

labelid screenshot

13. In the pop-up box, select “Label,” then OK. (See screenshot below.)

label screenshot

14. Now click on the “Overview” tab to tinker with your graph’s spatial/visual layout.

15. Click on the “Choose a layout” tab on the lower-left part of the window to determine how you want to display your nodes and edges.  A popular choice is either of the “ForceAtlas” templates, but I’d recommend tweaking the value of the “gravity” input to expand/contract the spread of your nodes to your liking.

16. At this point, you should be able to see your network graph on the “Overview” page and can choose to export your graph as a .pdf or as a Sigma.js template.  Click the “File” tab, scroll down to “Export” and select your preferred format for exportation.  If you choose to export using the Sigma.js template, Gephi will create a folder containing files ready to be uploaded to your server/webpage.  If you want to view the graph in a browser, click on the “index” file in the folder.

17. (ADVANCED) For those of you who wish to further tweak your visualization, you can customize your node colors and sizes by linking them to particular data attributes.

18. This is accomplished using the statistical analysis options on the right side of the “Overview” interface. (See screenshot below.)

analytics screenshot

19. To color code your nodes by cluster and to emphasize the significance of particular nodes via node size, you’ll need to run at least one of several statistical analyses, which are located on the right side of the screen.

20. I found running “Avg. Path Length” under the “Edge Overview” tab to be particularly helpful in visualizing relationships between nodes.  This statistical analysis generates new metrics such as “betweenness centrality,” which measures how often a node lies on the shortest paths between other nodes and which, when tied to node size, visually emphasizes the more significant nodes in the graph.  (See screenshot below.)

betweeness screenshot

21. To link “betweenness centrality” to node size, click the red jewel icon on the upper-left-hand side which selects the size/weight attribute for each node, then click the “Choose a rank parameter” tab on the upper-left side and select “betweenness centrality” as the attribute you wish to link to size/weight.  (See screenshot below.)

sizeweight screenshot

22. You can link any of the attributes from the “Choose a rank parameter” tab to node color, size/weight, and labels, so there are a great many customization options available to you.

23. Once you’ve established the link between your nodes and your attribute of choice, you can adjust the min/max size and color of your nodes to further customize your graph.

24.  This is as far as I’ve gone with Gephi, so I’ll end here with a couple final troubleshooting tips after you’ve exported using the Sigma.js template.
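A quick aside on steps 6 through 8 before the troubleshooting tips: if your connections already live in a script or database, you may prefer to generate the two-column file rather than type it into Excel.  Here is a minimal Python sketch of that idea.  The names in `edges` are placeholders rather than my actual data, the file name `edges.csv` is arbitrary, and the `Source`/`Target` header spelling is my assumption based on the column names above, so check the “Import Report” window to confirm Gephi read the file the way you expected.

```python
import csv

# Placeholder connections (hypothetical data); each pair is one edge.
edges = [
    ("Ana", "Luis"),
    ("Ana", "Marta"),
    ("Luis", "Marta"),
    ("Marta", "Pedro"),
]


def write_edge_list(path, edge_pairs):
    """Write one row per connection under a two-column header."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Header spelling is an assumption; adjust it if the import misreads it.
        writer.writerow(["Source", "Target"])
        writer.writerows(edge_pairs)


def count_nodes_and_edges(edge_pairs):
    """Nodes are the distinct names in either column; edges are the rows."""
    nodes = {name for pair in edge_pairs for name in pair}
    return len(nodes), len(edge_pairs)


write_edge_list("edges.csv", edges)
print(count_nodes_and_edges(edges))
```

The count is a handy sanity check: the node and edge totals it prints should match what Gephi reports after the import.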

TROUBLESHOOTING:
One issue I had during my first project was that in Overview, my labels and custom node sizes showed up fine, but upon exporting the graph, all the nodes were the same, tiny size and had no labels.

To fix this particular issue and/or to customize which nodes are labelled, you’ll need to open the folder created by the Sigma.js template and then open the “config.json” file with Wordpad or another plain-text editor (the file is JSON, so no special .xml editor is needed).  (See screenshot below.)

config screenshot

Once you’ve got the file open, scroll down to the lines “maxNodeSize” and “minNodeSize,” change the values until your nodes are large enough, save the file, and re-open the “index” file in your browser.  (See screenshot below.)

wordpad screenshot

To fix the label issue, lower the “labelThreshold” value in the config.json file.  Labels are shown only for nodes whose size is equal to or above this threshold, so a lower value labels more nodes.  Again, save the file once these changes are made, and re-open your “index” file in your browser to view the results.
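If you find yourself re-exporting often, the same config.json tweaks can be scripted instead of made by hand.  The following is only a sketch under stated assumptions: I am not asserting how the exporter nests its settings, so the helper searches the whole file for the three key names mentioned above, and the `sample` dictionary is a made-up stand-in for a real config, with invented numbers.

```python
import json


def set_keys(node, updates):
    """Set every key named in `updates`, wherever it is nested.

    Returns how many values were overwritten.
    """
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key in updates:
                node[key] = updates[key]
                changed += 1
            else:
                changed += set_keys(value, updates)
    elif isinstance(node, list):
        for item in node:
            changed += set_keys(item, updates)
    return changed


def update_config(path, updates):
    """Load a JSON file, apply the updates, and save it back in place."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    changed = set_keys(config, updates)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    return changed


# Hypothetical stand-in for an exported config; layout and values are invented.
sample = {
    "sigma": {
        "graphProperties": {"minNodeSize": 1, "maxNodeSize": 3},
        "drawingProperties": {"labelThreshold": 10},
    }
}
new_values = {"minNodeSize": 2, "maxNodeSize": 12, "labelThreshold": 4}
print(set_keys(sample, new_values))
```

To apply it to a real export, call `update_config` with the path to your own config.json and whatever values suit your graph, then re-open the “index” file.  If the function returns 0, none of the key names were found, which is a sign your file uses different names than the ones I saw.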

My first project using Gephi resulted in a network graph created using 33 nodes and 213 edges.  This was just a test run using the names of several prominent political figures from twentieth-century Mexican history, so my data set doesn’t actually analyze anything (however, I have plans, big plans, for the near future).

photo screenshot

This screenshot shows the level of interactivity you get from viewing the graph as a .pdf, which is to say a static image.  However, with the aid of the Sigma.js plugin, you can view a fully interactive version of the graph here, complete with clickable nodes containing a variety of attribute data.

I hope you find this tutorial useful, and I look forward to seeing your future projects using Gephi.  Stay tuned for more mapping and network graph projects I’m churning out this fall, and feel free to contact me at r.jordan@colostate.edu if you have any questions.


Filed under data visualization, digital humanities, gephi, history, mexican history, network graph, textual analysis

6 responses to “Getting started with Gephi”

  1. Great post! Looking forward to try it!

  2. Pingback: Gephi – curated list of tutorials | Insights @exploreyourdata

  3. Pingback: yEd Graph Editor | Lost Packets

  4. Pingback: Introducing Gephi | Lost Packets

  5. Kola

    Great post. This is most helpful and really a-step-by-step ‘gelphi for dummies’ post. Thanks

  6. Pingback: Playing Inside in the Lab on a Very Cold Day | Shaffer Lab
