Semi-automatically reference the source when taking notes from a PDF

Problem: Often when taking notes from a PDF, or when noting down a point of interest from a PDF, it’s useful to record the precise source (article and page number).  If you do this a lot, it can get tedious writing these things out.

When using Microsoft Word, it is possible to insert a proper bibliographical reference using your chosen reference manager (e.g. Zotero).  However, for daily note taking this can be a bind.  First, in the case of Zotero, the Insert Reference dialog can take a while to load.  Second, although you end up with a properly linked, automatically formatted reference, you still have to enter the information by hand every time: with the Zotero plugin for Word it takes at least 8 clicks, plus some typing, to insert a reference with a page number.  In addition, you can’t use this method anywhere that only accepts plain text or HTML, such as Zotero’s own notes.

Solution: This AutoHotkey script lets you press a key combination (Ctrl-Win-R in my case) while viewing a PDF and have the following put onto the clipboard: Author (article name, page number).

This script works best if you use a consistent file naming policy.  I have all my books and articles named in the following format: author1-author2---article-name.pdf (single hyphens between words and between authors, a triple hyphen between the authors and the title).  You can adjust the script to suit your policy.  You’ll note that the page number comes through as “fp.N”, rather than “p.N”.  This is because what is grabbed is the page number within the PDF file, rather than the original page number printed on the page (the page number of the journal).  It turns out that extracting arbitrary bits of text from PDF files (like page numbers in the header) is surprisingly difficult, but this solves the problem at least partially.

The script works with PDF-XChange Viewer version 2.5 and Foxit Reader version 5.3.  To adapt it or add other readers, you’d need to run AutoHotkey’s Window Spy program and find the control that holds the page number, along with the window class of the reader.
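
As an alternative to Window Spy, a small helper hotkey can report the same details.  This is only a sketch: the Ctrl-Win-W binding and the variable names are my own assumptions, not part of the script below, and it relies on the standard AutoHotkey v1 commands MouseGetPos, WinGetClass and ControlGetText.  Hover the mouse over the reader’s page number box, press the hotkey, and note the window class and control name it shows.

```autohotkey
; Hypothetical helper, not part of the original script.
; Ctrl+Win+W is an arbitrary choice of hotkey.
; Hover the mouse over the page-number box, then press the hotkey.
^#w::
; Get the window and the control (ClassNN) under the mouse cursor
MouseGetPos, , , hoveredWindow, hoveredControl
; Look up the class of that window, for use after "ahk_class"
WinGetClass, windowClass, ahk_id %hoveredWindow%
; Read the text of the hovered control (it should be the page number)
ControlGetText, controlText, %hoveredControl%, ahk_id %hoveredWindow%
MsgBox, Window class: %windowClass%`nControl: %hoveredControl%`nText: %controlText%
return
```

If the text shown is the current page number, the control name and window class are the two values to put into a new branch of the main script.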

AutoHotkey script:

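; Ctrl+Win+R: put "Author (Title, fp.N)" for the active PDF on the clipboard
^#r::
AutoTrim, On
SetTitleMatchMode, slow
WinGetActiveStats, WindowTitle, Width, Height, X, Y

; The file name is everything before the first " - " in the window title
DocumentName := SubStr(WindowTitle, 1, InStr(WindowTitle, " - ") - 1)

; Split authors from title at "---" before touching the single hyphens
SeparatorPos := InStr(DocumentName, "---")
DocumentAuthors := SubStr(DocumentName, 1, SeparatorPos - 1)
DocumentName := SubStr(DocumentName, SeparatorPos + 3)

; "author1-author2" becomes "Author1 & Author2"
DocumentAuthors := RegExReplace(DocumentAuthors, "-", " ")
StringUpper, DocumentAuthors, DocumentAuthors, T
DocumentAuthors := RegExReplace(DocumentAuthors, " ", " & ")

; "article-name" becomes "Article Name"
DocumentName := RegExReplace(DocumentName, "-", " ")
StringUpper, DocumentName, DocumentName, T
DocumentName = %DocumentName% ; this runs AutoTrim on this variable

; Read the page number from whichever supported reader is active
pageNumber := ""
if (InStr(WindowTitle, "PDF-XChange Viewer"))
{
    controlTitle := "DSUI:CmdEdit1"
    ControlGetText, pageNumber, %controlTitle%, ahk_class DSUI:PDFXCViewer
}
else if (InStr(WindowTitle, "Foxit Reader"))
{
    controlTitle := "RichEdit20W1"
    ControlGetText, pageNumber, %controlTitle%, ahk_class classFoxitReader
}

clipboard = %DocumentAuthors% (%DocumentName%, fp.%pageNumber%)
return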


Example output:

Hebblethwaite (Philosophical Theology And Christian Doctrine, fp.157)