The Early Modern Commons

Search Results for "collection-building"

Your search for posts with tags containing "collection-building" found 7 posts.

Three nasty problems.

Some distant-reading problems require thankless work in the dead of night.
From: The Stone and the Shell on 20 Feb 2016

A dataset for distant-reading literature in English, 1700-1922.

Literary critics have been having a speculative conversation about close and distant reading. It might be premature to call it a debate. A “debate” is normally a situation where people are free to choose between two paths. “Should I...
From: The Stone and the Shell on 7 Aug 2015

How to find English-language fiction, poetry, and drama in HathiTrust.

Finding works in a particular genre may still be the hardest part of distant reading. Here's a page-level map of 854,000 books that may help.
From: The Stone and the Shell on 29 Dec 2014

Distant reading and the blurry edges of genre.

There are basically two different ways to build collections for distant reading. You can build up collections of specific genres, selecting volumes one by one. Or you can take an entire digital library as your base collection, and subdivide it …
From: The Stone and the Shell on 23 Oct 2014

A half-decent OCR normalizer for English texts after 1700.

Perhaps not the most inspiring title. But the words are carefully chosen. Basically, I’m sharing the code I use to correct OCR in my own research. I’ve shared parts of this before, but this is the first time I’ve made …
From: The Stone and the Shell on 10 Dec 2013

Problems of scale.

Just a quick note here to acknowledge a collaborative project that I hope will generate some useful resources for scholars interested in text mining. We don’t have many resources up on the website yet, but watch this space. The project …
From: The Stone and the Shell on 20 Oct 2012

Where to start with text mining.

This post is less a coherent argument than an outline of discussion topics I’m proposing for a workshop at NASSR2012 (a conference of Romanticists). But I’m putting this on the blog since some of the links might be useful for …
From: The Stone and the Shell on 15 Aug 2012

Notes on Post Tags Search

By default, this searches for any categories containing your search term: e.g., Tudor will also find Tudors, Tudor History, etc. Check the 'exact' box to restrict the search to categories that exactly match your term. All searches are case-insensitive.
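
The matching rules described here can be sketched in a few lines of Python. This is only an illustration of the behaviour, not EMC's actual code; the function name tag_matches is hypothetical.

```python
def tag_matches(search_term, tag, exact=False):
    """Return True if a post tag matches the search term.

    Illustrative sketch only: the default is a case-insensitive
    'contains' match; exact=True requires the whole tag to match.
    """
    needle = search_term.casefold()
    candidate = tag.casefold()
    if exact:
        return needle == candidate
    return needle in candidate


# Default search: "Tudor" also finds "Tudors" and "Tudor History".
print(tag_matches("Tudor", "Tudor History"))
# Exact search: "Tudor" no longer finds "Tudors".
print(tag_matches("Tudor", "Tudors", exact=True))
```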

This is a search for tags/categories assigned to blog posts by their authors. The terminology used for post tags varies across different blog platforms, but WordPress tags and categories, Blogspot labels, and Tumblr tags are all included.

This search feature has a number of purposes:

1. to give site users improved access to the content EMC has been aggregating since August 2012, so they can look for bloggers posting on topics they're interested in, explore what's happening in the early modern blogosphere, and so on.

2. to facilitate and encourage the proactive use of post categories/tags by groups of bloggers with shared interests. All searches can be bookmarked for reference, making it possible to create useful resources of blogging about specific news, topics, conferences, etc. Bloggers could agree on a shared tag for posts, or an event organiser could announce one in advance, as is often done with Twitter hashtags.

Caveats and Work in Progress

This does not search post content, and it will not find any informal keywords/hashtags within the body of posts.

If EMC doesn't find any <category> tags for a post in the RSS feed, the post is classified as uncategorized. These posts, along with any marked with the <category> 'uncategorized' in the feed itself, are omitted from search results. (It should always be borne in mind that some bloggers never use any kind of category or tag at all.)
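
As a rough sketch of this rule, the Python below reads the <category> elements from a single RSS <item> and drops any 'uncategorized' entries. It is an assumption about how such filtering could work, not EMC's implementation: the function name searchable_categories is hypothetical, and treating an empty result as "omitted from search" is one reading of the rule.

```python
import xml.etree.ElementTree as ET


def searchable_categories(item_xml):
    """Return the categories of one RSS <item> that a tag search could use.

    Illustrative sketch only: an empty list means the post would be
    treated as uncategorized and left out of search results.
    """
    item = ET.fromstring(item_xml)
    categories = [(c.text or "").strip() for c in item.findall("category")]
    # Drop blank entries and any explicit 'uncategorized' label.
    return [c for c in categories if c and c.casefold() != "uncategorized"]


example_item = """
<item>
  <title>Example post</title>
  <category>London</category>
  <category>Uncategorized</category>
</item>
"""
print(searchable_categories(example_item))
```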

This is not a 'real time' search, although EMC updates content every few hours, so it's never very far behind events.

The search is at present quite basic. I plan to add a number of more sophisticated features in the future, including the ability to filter by blog tags and by dates. I may also introduce RSS feeds for search queries at some point.

Constructing Search Query URLs

If you'd like to use an event tag, you can work out in advance what the URL will be without visiting EMC and running the search manually (though it's wise to check that it works!). You'll need to URL-encode any spaces or punctuation in the tag, so it may be simpler to avoid them.

This is the basic structure:

http://emc.historycarnival.org/searchcat?s={search term or phrase}

For example, the URL for a simple search for categories containing London:

http://emc.historycarnival.org/searchcat?s=london

The URL for a search for the exact category Gunpowder Plot:

http://emc.historycarnival.org/searchcat?s=Gunpowder%20Plot&exact=on

In this more complex URL, %20 is the URL encoding for a space between words, and &exact=on adds the exact-category requirement.
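
If you'd rather generate these URLs in a script than encode them by hand, here is a minimal Python sketch. The base URL and the s and exact parameters come from the examples above; the function name build_search_url is hypothetical, and the standard library's urllib.parse.quote does the encoding.

```python
from urllib.parse import quote

# Base URL taken from the examples above.
BASE_URL = "http://emc.historycarnival.org/searchcat"


def build_search_url(term, exact=False):
    """Build an EMC tag-search URL for the given term (illustrative sketch)."""
    # quote() percent-encodes spaces as %20 and escapes punctuation;
    # safe="" makes it encode "/" as well.
    url = f"{BASE_URL}?s={quote(term, safe='')}"
    if exact:
        url += "&exact=on"
    return url


print(build_search_url("london"))
print(build_search_url("Gunpowder Plot", exact=True))
```

The second call reproduces the Gunpowder Plot URL shown above. It is still worth opening a generated URL once to confirm the search behaves as expected.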

I'll do my best to ensure that the basic URL construction (searchcat?s=...) is stable and persistent as long as the site is around.