
Bookmark Scrubber

THIS PROJECT HAS BEEN MOVED TO GITHUB

https://github.com/nickshin/bookmark_tools

This page has been left here for historical purposes...


Download

View the (NEW) bookmark scrubber Python script and the implementation document here.
Download the (OLD) bookmark scrubber JavaScript project.


This project originally started out as a backup and restore utility. Some may ask why something like this is needed when similar utilities already exist.

FEBE (Firefox Environment Backup Extension) is one such utility, and it is my favorite of its type.

Xmarks (formerly FoxMarks) is also an interesting one.

But none of these solved a few problems I was having.

Issue #1: sync

In my day-to-day work, I use a lot of different computers: a few at the office, a few at home, portables, etc. Every now and then, the bookmarks on one computer don't match those on the others, and it gets to be a real pain in the @$$ to re-sync them. (Firefox Sync and tab syncing have come a long way in providing this exact feature, and I really like it.)

The bookmarks, whether in the HTML format or the JSON file (which is smashed into one line), contain extra data that I don't really need, and this makes the diff proggy fill up the screen with a pile of cruft.

However, there is something I do need that's inlined with the bookmark entry in addition to the description data that's kept on a separate line:

Diff with false positive changes (in red) and the valid changes of interest (in blue).
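As a rough illustration of the noise problem, the sketch below takes a one-line bookmark JSON dump, drops fields that add noise to a diff, and re-serializes the result with one key per line. The field names in `NOISY_KEYS` and the helper `streamline()` are assumptions made for this example, loosely based on what Firefox bookmark backups tend to contain; they are not taken from the actual script, and the real set of fields to keep depends on what the restore feature needs.

```python
import json

# Assumed noisy fields (hypothetical choice for this sketch).
NOISY_KEYS = {"lastModified", "dateAdded"}

def streamline(node):
    """Return a copy of a bookmark node without the noisy fields."""
    clean = {k: v for k, v in node.items()
             if k not in NOISY_KEYS and k != "children"}
    if "children" in node:
        # Recurse so that nested folders are cleaned as well.
        clean["children"] = [streamline(child) for child in node["children"]]
    return clean

# A made-up one-line dump, similar in shape to a bookmark backup.
raw = ('{"title":"Menu","lastModified":1262304000000000,"children":'
       '[{"title":"Example","uri":"http://example.com/",'
       '"dateAdded":1262304000000000}]}')

tree = streamline(json.loads(raw))
# indent and sort_keys give a stable layout with one key per line.
readable = json.dumps(tree, indent=2, sort_keys=True)
print(readable)
```

With a stable key order and one key per line, a diff tool only flags the lines that really changed.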

Issue #2: scrub

Storing the bookmarks online sounded fun, but I wanted to do it with one caveat. Xmarks offers private features and can host bookmarks on your own servers (the same goes for Firefox Sync). But there are entries that I do not want published anywhere.

Say I would like to use the social features that these online bookmark services offer. But I work with a lot of clients and accumulate a lot of internal links that shouldn't be seen by outsiders. Private features or not, a shared remote hosting site is still a non-trusted location. Intranet links describe network architectures, which is very useful information for hackers or others with nefarious purposes.

Therefore, I needed a way to scrub my bookmarks. One way is to allow bookmark items to be tagged within the Bookmark Organizer so that a public-friendly version can be exported. Some of the items I filter out include:

Bookmarks with personal folders that I do not wish to be made public.
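The scrubbing idea can be sketched as a recursive filter over the bookmark tree. Everything named here is hypothetical: the `private` tag, the `Personal` folder title, the comma-separated `tags` field, and the `scrub()` helper are assumptions for illustration, not the markers or functions the actual script uses.

```python
PRIVATE_TAG = "private"          # hypothetical marker tag
PRIVATE_FOLDERS = {"Personal"}   # hypothetical folder titles to hide

def is_private(node):
    """True if the node carries the private tag or is a private folder."""
    # Assumes tags are stored as one comma-separated string.
    tags = [t.strip() for t in node.get("tags", "").split(",")]
    is_folder = "children" in node
    return PRIVATE_TAG in tags or (is_folder and node.get("title") in PRIVATE_FOLDERS)

def scrub(node):
    """Return a public copy of the node, or None if the node is private."""
    if is_private(node):
        return None
    public = {k: v for k, v in node.items() if k != "children"}
    if "children" in node:
        kept = (scrub(child) for child in node["children"])
        public["children"] = [child for child in kept if child is not None]
    return public

bookmarks = {
    "title": "Bookmarks Menu",
    "children": [
        {"title": "Python", "uri": "https://www.python.org/"},
        {"title": "Client wiki", "uri": "http://intranet.example/wiki",
         "tags": "work,private"},
        {"title": "Personal", "children": [
            {"title": "Bank", "uri": "https://bank.example/"},
        ]},
    ],
}

public = scrub(bookmarks)
print([child["title"] for child in public["children"]])  # ['Python']
```

Dropping a private folder removes everything inside it, so nested intranet links cannot leak through an otherwise public parent.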

In a nutshell

Making bookmark syncing a little less painful meant sticking with a human-readable JSON data format. This still allows for rapid backup and restoration, and it lets me use basic revision control systems to track which bookmarks were deleted, moved or edited.

Diff of the human-readable, streamlined JSON dataset.
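To show why a line-per-key layout works well with revision control, here is a small sketch that compares two snapshots of a tiny bookmark tree with Python's standard `difflib`. The snapshots, the file names, and the `to_lines()` helper are made up for the example.

```python
import difflib
import json

def to_lines(tree):
    """Serialize a bookmark tree with one key per line and a stable key order."""
    return json.dumps(tree, indent=2, sort_keys=True).splitlines()

# Two hypothetical snapshots: only the URL scheme of one bookmark changed.
old = {"title": "Menu", "children": [
    {"title": "Example", "uri": "http://example.com/"}]}
new = {"title": "Menu", "children": [
    {"title": "Example", "uri": "https://example.com/"}]}

diff = list(difflib.unified_diff(
    to_lines(old), to_lines(new),
    fromfile="old.json", tofile="new.json", lineterm=""))
print("\n".join(diff))
```

Only the edited `uri` line shows up as removed and added; the rest of the output is unchanged context.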


A project using the Bookmark Scrubber

One example of the generated bookmark scrubber output in use is my bookmarks page, which is available online.


Releases

version 1.0

Python Script
Contains the functions jsonbkmks() and json2html(), both written in Python. The code can be compared against the JavaScript version for reference. The Python script runs much faster than the JavaScript version.

versions pre-1.0

Note: these are the JavaScript versions
jsonbkmks
Extracts the bare essential JSON entries needed for the Bookmark Organizer's restore feature to still work. The streamlined data is formatted to be diff-edit-merge friendly.
json2html
This generates the public (scrubbed) and private (unscrubbed) HTML bookmark versions.
crypto
Encryption and decryption examples. Useful when archiving private (unscrubbed) data to a shared hosting site.
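For a feel of what a JSON-to-HTML step like json2html might involve, the sketch below walks a bookmark tree and emits markup in the style of the Netscape bookmark file format that browsers export. The `to_html()` helper and the sample tree are assumptions for illustration only; this is not the project's json2html code.

```python
from html import escape

def to_html(node, depth=0):
    """Render a bookmark node as lines of Netscape-style bookmark HTML."""
    pad = "    " * depth
    title = escape(node.get("title", ""))
    if "children" in node:
        # Folders become a heading followed by a nested definition list.
        lines = [f"{pad}<DT><H3>{title}</H3>", f"{pad}<DL><p>"]
        for child in node["children"]:
            lines.extend(to_html(child, depth + 1))
        lines.append(f"{pad}</DL><p>")
        return lines
    # Plain bookmarks become a single link; escape() also quotes attributes.
    href = escape(node.get("uri", ""))
    return [f'{pad}<DT><A HREF="{href}">{title}</A>']

tree = {"title": "Tools", "children": [
    {"title": "Q&A", "uri": "http://example.com/?a=1&b=2"}]}

lines = to_html(tree)
print("\n".join(lines))
```

Running the same function over a scrubbed tree and over the full tree would yield a public and a private HTML version, respectively.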


Change List

v 1.0
Python version of the bookmark scrubber project
v 0.4
JSONBKMKS: generate the scrubbed version of the streamlined JSON format
v 0.3
JSONBKMKS: generate bookmarks in a streamlined JSON format that is diff-edit-merge friendly
v 0.2
CRYPTO examples
v 0.1
JSON2HTML: generate public + private bookmarks in the HTML format


Copyright © 2009-2010 by ESTSS. All Rights Reserved.