Searching with Solr
How to GITify Your Changes: The Easy Peasy Lemon Squeezy Way
Microdata: Making Metadata Matter for Machines
Data Quality in Evergreen
How to Choose the Right Hardware for PostgreSQL and Evergreen
There and Back Again, Again
Searching with Solr aka “Is Solr rad? Solr is soooo rad”
David Busby, assoc. w/ King County Library System
- Presenting an unfunded side project on search for libraries.
- Solr search: not aimed at librarians, more at patrons.
- Patrons prefer a Google-like, all-in-one search box.
- The search engine returns results that link directly to the OPAC record.
- Ideal goal: integrate the Solr search engine with Evergreen's search box infrastructure.
- Browse-by-facets demo'd.
- (Observational comment: so awesome, so fast! :) )
- Asked for examples of Solr + Evergreen in crowd, one response (see video).
- Each collection has its own schema
- Uses copyField to throw data into type/facet fields
- Uses a simple REST API query (see the sketch below)
- The schema is not attached to any branch; call numbers for each record are dumped into an array - easier for manipulation/display?
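A minimal sketch of such a REST query, assuming a local Solr core named "evergreen" and hypothetical field names (title, subject, call_numbers):

```python
# Query Solr's select handler over HTTP and read the JSON response.
import json
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({
    "q": "title:dracula",       # Google-like query against the title field
    "facet": "true",            # ask Solr to compute facets for browse
    "facet.field": "subject",   # hypothetical facet field fed by copyField
    "wt": "json",               # JSON response writer
})
url = f"http://localhost:8983/solr/evergreen/select?{params}"
with urllib.request.urlopen(url) as resp:
    results = json.load(resp)

for doc in results["response"]["docs"]:
    # call numbers are stored as a plain array per record, per the notes above
    print(doc.get("title"), doc.get("call_numbers"))
```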
- The test database is based on Project Gutenberg data. Looking for a library willing to contribute their records.
- EG “indexes”/searches with Postgres; performance is not exactly Google-fast. Solr might be optimal.
- Solr can tokenize & normalize data, and doesn’t require a lot of coding to implement in comparison to other available options.
- Also, documentation in Solr is robust.
- Solr and Evergreen are both open source - why don’t we work together?
- Q: Solr is very fast, yes. How might we reconcile EG search being slow because it applies search permissions on top of item search, i.e. items in my library system?
A: Anyone with a good idea of how to fix the call numbers & branches issue (properly limiting results to the right branch, etc.): let us know!
- Q: Regarding the dynamic nature of Solr: if I add a new record, would I have to do a mass re-index?
A: An update is necessary. If the status of an item changes, EG knows, but Solr won’t know immediately.
- Q: How does Solr stream changes?
A: Via a trigger (available on GitHub); it avoids inserting into a busy table by leaning on UDP packets for the update (see the sketch below).
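The receiving side of that UDP approach wasn't shown; a minimal sketch, assuming the trigger sends changed record IDs as plain-text datagrams to a known port (the port and all names here are hypothetical):

```python
# Listen for record IDs pushed by the database trigger over UDP,
# then re-index just those records in Solr.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 9999))  # port is an assumption; match the trigger's target

while True:
    data, _addr = sock.recvfrom(1024)
    record_id = data.decode("utf-8", "replace").strip()
    # Here you would re-fetch this record from Evergreen and POST it to
    # Solr's update handler; batching IDs before committing keeps churn low.
    print("record changed:", record_id)
```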
- Solr doesn’t touch the Evergreen database for its search/browse; the direct link points back to the Evergreen record.
- Evergreen search relies heavily on dropdowns; wouldn’t it be great to integrate so you can search by branch?
- How can Solr get so close to real-time Evergreen data?
- Q: Re: the trigger - do you know when it will be added to the distribution?
A: Doubtful that anyone will.
- Q: Have you looked at VuFind?
A: Glanced at it, but haven’t looked closely. Will it be our community-building future?
A: Doesn’t work at the scale that libraries might need. The trigger will work with API services like Summon, etc., so you can have a multichannel approach. Suggest looking at that before progressing further.
Comments: The Finnish national library uses VuFind, and the Dutch have implemented OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) harvesting for VuFind.
- Q by Busby: So is Solr not relevant?
A1: No - it would be great if Solr could be implemented as part of Evergreen, because EG search needs to be improved.
Note: the patch that provides the trigger table was written by an Evergreen dev but needs more attention from the community in order to keep up.
Dan Scott: the trigger hinges on call numbers, availability changes, etc. Status updates are not an issue in EG.
Q posed: Polling a small table every 2 minutes is a lightweight solution - but how do we integrate it into TPAC?
A: a RESTful API talking to Aleph
- Q: Do your patrons have a unified viewing experience?
A1: Yes, no Aleph screens. Could EG have APIs like that to tie in?
A2: Well... the Dutch made that OAI harvester, which could be used.
Maybe for v2.5...
- Want a unified experience for users, easy to maintain and update.
- 300 or 400 demo users, 40,000 demo records: indexing this would be quick, but we would want to get to real-time for the sake of users.
- Ideally: one call to Solr gets one push of data at once to populate results (“everything at the wall”), including the desired record, rather than slow loading of fields.
- Availability status must be indexed; it can’t be done with OpenSRF calls. Availability is a key feature and needs to be part of this Solr work.
- Any other questions?
- Q: Any idea of the cost or man-months for Solr to replace Evergreen search?
A: Don’t think it could be answered on the spot. With the people in this room, it could be done in a year, but it requires a lot of commitment. Progress so far is the result of 20 man-hours spread out over available free time.
- How would we get it backported into Evergreen v2.2? Not every library would be on board. Ideally it could be bolted on as an install option... and not be hard to implement.
- The demo uses a PHP script & Bootstrap CSS.
- Will get back with such an assessment. Offhand estimate: could be a year & $60,000.
- Preliminary question: what do we want to index with Solr?
- This issue is not really at the Evergreen/Postgres level; it’s more of an Ajax implementation issue.
- A search interest group for this? No comment.
Impromptu: demo of https://carleton.mnpals.net/vufind/ via Steve - uses VuFind on top of Solr for (in-catalogue) discovery.
- Shows a harvest from III, then the Summon APIs: all in one display
- Zoom in on books only and we get VuFind facets rather than Summon distinctions. Facets are calculated separately based on the items sought (catalogue items vs. scholarly e-resources).
- Q: How long for implementation?
- A: Had prior experience with VuFind + Solr, so almost no time at all. Talk with him about a consortium implementation of this.
- EBSCO has a discovery product too; want to talk with them about their API & how to integrate with the ILS. Let’s chat about how to integrate with Evergreen, Solr, etc.
How to GITify Your Changes: The Easy Peasy Lemon Squeezy Way
[Notetaker’s notes: “...” indicates more content that was missed. Will try to fill-in later.]
- 1) How to save changes before upgrades in the Windows environment
3) How to pull code from the community
ALL within the Windows environment
- “git is basically a snapshot of your file system (a.k.a. the cloned repository that you’ve set up locally).
Every time you commit your project in git, it takes a picture of what all your files look like....”
- Git in a nutshell - workflow
- Basically, the git workflow goes something like this:
- clone the Evergreen origin (master)
- modify files in your working branch
- stage files
- do a commit, which takes a snapshot of the files as they are in the staging area
- Why on earth use Windows?
- Because everyone else is using Windows; it’s easier to use the same platform as your users (public & school libraries). Interoperability is an issue.
- git program tools for windows
- SmartGit - great GUI, until they realized it can’t sign commits
- git-cola - an intense program b/c it requires a Python install, etc. Didn’t get past the install stage
- Final choice: Git for Windows - still has some issues, but overall the best choice
- TortoiseGit & EGit - have heard good things but haven’t used them
- How to use git locally: setting up git for windows
- Assumes you’re using GitHub or have a server available.
- 1) Install Git for Windows (http://msysgit.github.com); it will give you Git Bash & Git GUI
- Open Git GUI, select “Create New Repository,” browse to your local C drive & create a folder; keep separate folders
- Under Edit > Options you’ll want to add your name and email address for the global & local repos.
- If you need an SSH key, you can generate one via Help > Show SSH Key
- Go to Remote > Add
- Name this remote and then enter its location (URL)
- How to use git locally: Creating branches & pushing changes
- 1) Go to Branch > Create
- 2) Copy the files you’ve already edited on your production server into the folder you set up previously.
- 3) Then scan for file changes.
- The remaining steps go in order:
- … : Writing commit statements
- The first line should be a subject line, followed by a blank line, then a description, followed by a blank line, then a sign-off message (see the example below).
- 5) Sign off
- 6) Commit
- 7) Push
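An example of that shape (subject, description, and sign-off here are all hypothetical):

    Fix typo in circulation docs

    Corrects a misspelling of "checkout" in the circ documentation.

    Signed-off-by: Jane Doe <jane@example.org>

(Git GUI’s Commit > Sign Off adds the Signed-off-by line for you.)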
Using git locally: upgrading and not losing changes
- Requires learning to “speak like a cool kid”:
- rebase & merge
- rebase - instead of ugly merge commits,
- it handles reattaching your commits on top of the updated branch
- (presenter personally only rebases)
- cherry-picking: grabbing one commit from a developer to test, etc.
How to contribute to community using git
- create SSH key via PuTTYgen
- Send your SSH key to EG repository server
How do I connect to EG Repository?
- If you didn’t clone the EG repository back when setting up...
- ...open Git GUI & clone it now.
11 steps to contribute code to community
1) Open Git GUI & switch branches to ‘master’
2) Open Git Bash & cd to your working repo on the C drive
3) Type git pull
4) Switch back to Git GUI and create a new branch called:
<username you were given>/<module description>
5) Edit the file(s) in Notepad++ (recommended)
[Do not use tabs! They make alignment iffy; use spaces / convert tabs to spaces. In v5.9: Edit -> Blank Operations -> TAB to Space]
7) Stage your change
8) Write your commit message
11) Push to your working branch
You can then see the commit log & double-check
Don’t forget the bug ticket
- Go to https://launchpad.net/evergreen & create an account
- Click “Report a bug”
- Create a summary statement
- Report the bug by entering the information
- Expand the “extra options”
1) Copy the commit ID from the commit you want to test
- Open Git GUI, create a new branch
- Go to the master branch
- Open Git Bash and navigate to the “master” branch
- Type git fetch --all or git fetch <remote>
- Type git checkout (branch name from Step 3)
- Type git cherry-pick -s (press the Insert key to paste the commit ID)
- We want to avoid merging (it’s not easy).
Helpful git resources: PP slide of resources
Suggestion: Do it in baby steps
- Get used to pushing git changes to your own server
- then share with others
- then download others’ code & experiment
Microdata / Structured Data: Making Metadata Matter for Machines
Dan Scott | coffeecode.net
- TPAC presents metadata for humans
- What about the “toasters”?
- MARC is not machine readable (ironic, given the name)
- An API that provides an alternative representation of the metadata is available.
- RDF/RDFa: representing RDF information within HTML
- These microformats weren’t gaining adoption except in Evergreen and other small places.
- A web page refers to many entities (people, places, events, creative works)
- Regular HTML links from one entity to another web page are one way of identifying relationships that humans and machines understand
- Knowledge graph - comes up from time to time.
- Freebase allows you to scrape a bunch of resources & relationships & extend further.
- Semantic web -> now “linked data”
- RDF triples buzzed initially after 2001, then lay dormant for some time.
- With more linked data, people wanted to do more with the data -> enter schema.org microdata
- Search engines wanted better metadata than plain HTML could provide
- The semantic web was not evolving in practice
- June 2011 - the schema.org announcement (Google, Bing, Yahoo)
- …[to be added later]
- Human view vs. HTML view
- HTML + schema.org (microdata)
- with a few additions, it allows you to ID valuable information (itemprop, itemtype, itemscope)
- without these microdata hints, it’s harder to figure out the content that is being given out.
- Search engine view: pulls out all the marked-up schema.org microdata
- shows item, property, ‘keywords’ - good enough for now (see the sketch below)
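A minimal sketch of what that machine view picks up, using Python's standard-library HTML parser and a hypothetical marked-up snippet (the schema.org attributes are real; the book is just an example):

```python
# Pull itemtype/itemprop hints out of schema.org microdata, roughly
# the way a search engine's extractor would.
from html.parser import HTMLParser

snippet = """
<div itemscope itemtype="http://schema.org/Book">
  <span itemprop="name">The Hobbit</span>
  by <span itemprop="author">J. R. R. Tolkien</span>
</div>
"""

class MicrodataSniffer(HTMLParser):
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            print("entity:", attrs["itemtype"])   # the "thing", not a string
        if "itemprop" in attrs:
            self._prop = attrs["itemprop"]        # remember the property name

    def handle_data(self, data):
        if getattr(self, "_prop", None) and data.strip():
            print(" ", self._prop + ":", data.strip())
            self._prop = None

MicrodataSniffer().feed(snippet)
# entity: http://schema.org/Book
#   name: The Hobbit
#   author: J. R. R. Tolkien
```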
- Things, not strings. Relationships, disambiguation.
- “Dan Wells” the author vs. famous Evergreen developer
- “West Side Story” as an adaptation of “Romeo & Juliet”
- Exposes & enhances library resources:
- Search engine results could link to local or preferred libraries
- example: great for libraries not involved with WorldCat.
- physical or electronic resources
- Ross Singer’s Backbeat GreaseMonkey script reads microdata (schema.org); client-side.
- Could implement with a screen scrape, but you could do the same with Backbeat GreaseMonkey
- Book, Map, Movie, MusicAlbum, Painting, Photograph, Sculpture
- Added entries: author, accountablePerson, ...
- Libraries are objects, too
- schema.org vocab defines “Thing > Organization > LocalBusiness > Library”
- Supports attributes such as:
- Address & contact info
- Opening hours
- Definite potential for linking outwards!
- Obvious for North American libraries: LoC & LoC microdata
- Could link to VIAF, but that is currently a dead end in terms of linking further out (at least in terms of structured data)
- As OCLC is currently working on schema.org in WorldCat, we will see what happens.
- Why a common identifier between libraries? It just increases the knowledge graph and what resources are available in external engines. That reaches back to FRBR.
- schema.org: vocab in process
- the amount of action in the W3C Schema.org Bibliographic Extension community is currently low (Karen Coyle, etc.) -- it’s a library standards committee -- but it has a lot of potential and great discussion
- Dan’s role: using EG as a reference implementation. It helps to discuss things not in the abstract.
- RDFa was designed with the idea that it would mix many different specialized vocabs in a single web page. Seven different vocabularies working together: messy.
- Schema.org documents all the objects and properties in a single vocabulary
- When the schema.org markup was published, generated a map of all URLs & a sitemap, and monitored crawling by Google/Bing.
- Google gave up after crawling 475,000 URLs, likely because a lot of internal links were considered not relevant to search engine users.
- Maxed out at around 8,000 identified structured-data objects, but that dropped to a few thousand (probably due to the MusicRecording vs. MusicAlbum mistake)
- Evergreen: schema.org state
- EG 2.2 through 2.4: 2 primary types: Book & MusicRecording*
- Just plain text for attributes
- A working branch improves this greatly…
- Linking Out
- MARC makes things hard, and a lot of the MARC handling is done through TPAC templates.
- Is it worth the effort to implement?
- Opinion: it’s a long-term strategic decision; it will pay off later.
- RDFa Lite: like microdata, but a W3C standard that is backwards compatible with RDFa
- Currently missing only one relatively minor feature that microdata offers (copy & paste)
- Quote, on the mythical differences between RDFa Lite & Microdata:
- “at this point, you may be asking yourself why these two languages are so similar? ... RDFa Lite is backwards compatible with RDFa while building on schema.org attributes.”
Data Quality in Evergreen
- especially in bibliographic and cataloging data
- Q: What is a modern OPAC or discovery layer?
- A: A tool that catalogers use to check fixed fields.
- Serious point: structure alone does not mean there is a quality experience.
- Some data quality is absolute:
- if it breaks EG, it should stand to be improved
- XML has to be well-formed at a minimum.
- Concrete examples: bibliographic, authority, and serial expressions are in XML.
- XML isn’t the only one where structural integrity matters, MARC is another.
- In one sense, the structural data-quality issue is the least interesting, because unless it’s too horribly broken, you can convert horrible structure into something better.
- Leader position 09 indicates the character encoding. If blank, it means MARC-8 encoding (ugly). See the sketch below.
- At heart this is interesting for those interested in languages, but at the end of the day you just want things to display correctly.
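A quick way to spot such records in a batch is a few lines with the pymarc library (the filename is hypothetical; Leader/09 blank means MARC-8, "a" means Unicode):

```python
# Count records whose Leader/09 says MARC-8 rather than UTF-8.
from pymarc import MARCReader  # pip install pymarc

marc8 = 0
with open("records.mrc", "rb") as fh:
    for record in MARCReader(fh):
        if record.leader[9] == " ":   # blank = MARC-8; "a" = UCS/Unicode
            marc8 += 1
print(marc8, "records still claim MARC-8 encoding")
```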
- Sometimes structure gets in the way of TPAC. Perform a search, get 20 results, but only 2 show up.
- Data existentialism: If a bib record lacks a 901 field, does it really exist? Answer: No.
- Access has to be the foundation of data quality in a catalog. Some data is better than none in almost any circumstance.
- This is where you have to take into account a very important context. J. Mac Elrod (BC cataloguing maven): “A record view of a catalogue is not enough. What we’re ideally building is a catalogue as a unified entity for the library’s users.”
- Ideally, you are building a catalogue , not just a pile of MARC records.
- to each catalog its audience
- and its maintainers
- Some access is better than no access
- For a metadata record that doesn’t exist, quality is a moot point. Without metadata, items cannot be found, and items do not describe themselves.
- You may manage to make an item unfindable if the data is not accurate.
- More important: being able to link to it in the future.
- There is much string processing and matching to do in going from the current way of cataloguing to linked data.
- This matters a lot to modern catalogers.
- Again, this means an item could be made unfindable if it isn’t described in a way patrons expect.
- This can only be answered by talking to reference librarians or the public for context.
- too little
- too much
- example: shared data in a consortial catalogue in addition to the library’s own records (duplicates).
- This has a lot to do with getting everyone in a single room and deciding on a common cataloging policy.
- Automated checking
- being able to catch potential errors.
- Libraries have a lot of potential to contribute.
- A tool for identifying various issues in bibliographic records: rule violations, etc.
- A command-line tool to check XML for well-formedness saves time when trying to load bibliographic records (see the sketch below).
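The specific tool wasn't captured in the notes (xmllint is a common choice); an equivalent check is also a few lines of standard-library Python:

```python
# Fail fast on malformed MARCXML before a batch load.
import sys
import xml.etree.ElementTree as ET

try:
    ET.parse(sys.argv[1])              # raises ParseError if not well-formed
    print(sys.argv[1], "is well-formed")
except ET.ParseError as err:
    print(sys.argv[1], "is NOT well-formed:", err)
    sys.exit(1)
```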
- Overheard: call number vs. prefix reports (Jeff Godin)
- Batch updating
- A gathering of data-ticians
- Call for data tips & tools; a clearinghouse on the wiki for this info
How to Choose the Right Hardware for PostgreSQL and Evergreen
Joshua Drake (Command Prompt, Inc.)
- Postgres experts
- Like Robert DeNiro in Brazil!
- helps you with your ducts
- “I’ve been up so long, there’s not a damn thing left in this environment that scares me.”
- you probably broke it!
- what did they do to the OS?
- how is Postgres configured?
- let’s document it!
- catch 99 out of 100 problems that everyone runs into so we can work on the interesting problems
- Educating people on PostgreSQL
- what are we talking about today?
- EG is not special to Postgres
- typical assumed requirements:
- lots of CPU
- a decent amount of RAM
- a bit of I/O
- i.e. CPU > memory > I/O
- in reality: I/O ALWAYS first,
- then RAM,
- then CPU
- PG is process-based
- Why the difference?
- indexes don’t actually contain the data
- random writes to relations
- sequential scans of large tables
- queries doing weird things
- reporting (don’t do it on your main server!)
- writes must satisfy the D (durability) in ACID (writes go to the WAL first)
- EG: search queries
- EG: hold query based on search
- by default, very slow and painful
- rule #1 hardware RAID controller with Battery Backup Unit
- rule #2: only 2 acceptable levels: 1 and 10
- anything else is wrong!
- (except perhaps RAID 1+0 or 0+1)
- rule #3: it is better to purchase 14 small drives than 7 big drives
- standard media types of I/O
- yeah, it’s slower, but a USENIX study compared fibre-channel, SAS & SATA, and SATA was more reliable than any of them!
- nothing wrong with it
- need more drives because it spins slower
- reasonably priced now, very fast.
- can make it faster by using “short stroking”
- prices are coming down
- have to be very careful: very few are safe
- no power-loss protection in consumer-grade SSDs
- Intel 320, 710
- OCZ R Series
- can get a terabyte for $400
- most of these are consumer-grade drives
- SSD random write performance
- 364 MB/second
- pretty impressive! (link source on slide)
- a HW RAID controller is required
- a BBU is required
- one specific purpose:
- if someone pulls a power cord, the data just waits for you (3 days to a week)
- (of course, there are exceptions)
- better to buy enterprise stuff
- RAID 1 & 10 are the only ones that matter
- you can get some performance for shared reads, it depends
- perfect for PostgreSQL transaction logs
- RAID 1+0: can possibly survive a 2-disk failure
- minimum 4 spindles
- you will not find a faster array
- direct correlation between # of spindles and speed of DB
- add spindles
- SAS drive averages only 25MB/s
- “is that performance I smell?”
- the larger the drive, the slower it performs
- it’s just the way it is
- may partition a large drive to only use a small portion to keep the speeds up
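For ballpark figures like these, a crude sequential-write probe is a few lines (a real benchmark such as bonnie++ or fio is the right tool; this sketch measures best-case sequential writes only, and the path is hypothetical):

```python
# Rough sequential-write throughput probe for the volume under test.
import os
import time

path = "/tmp/throughput_probe.bin"   # point this at the array you care about
block = b"\0" * (1024 * 1024)        # 1 MiB per write
n = 512                               # 512 MiB total

start = time.time()
with open(path, "wb") as f:
    for _ in range(n):
        f.write(block)
    f.flush()
    os.fsync(f.fileno())              # force to disk, like a WAL flush must
elapsed = time.time() - start
os.remove(path)
print(f"{n / elapsed:.1f} MB/s sequential write")
```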
- Buy more memory!
- Intel stuff is faster, but AMD has more cores; consider buying AMD instead
- if you’re running anything less than Postgres 9.2, upgrade!
There and Back Again, Again
Rogan Hamby, Robin Johnson & Galen Charlton
- networking and its impact on EG
Why YouTube kills your bandwidth
- bandwidth over time
- a kid comes in to watch educational videos at the same time each day & kills your bandwidth!
- network saturation → higher latency
- pingplotter is a great Windows tool that avoids needing to do a lot of scripting
- slows down your circulation
- Traffic Shaping
- static vs. dynamic
- dynamic: if nothing else is using the link, it’s OK for BitTorrent to use it all
- potential traffic hierarchy:
- 1 evergreen (bursty, small)
- 2 staff
- 3 patrons
- 4 bulk (video, P2P, etc)
- a need to work with the staff to manage their expectations
- “my workstation, etc. doesn’t work unless I reboot the router”
- DHCP lease settings
- trim down the lease times (the default is a whole day)
- transparent proxies, SSL interception
gathering some intelligence
- packet-inspection solutions are not terribly effective for the money
- generalized assumptions for traffic shaping work better
- gathering statistics on EG’s network impact
- what will the impact on my system be?
- not a lot of public info out there yet
- could be helpful in communicating with vendors
Networking within EG
- recent serious Postgres security release
- never expose the DB port (5432) to the outside world
- you should always know which ports you are exposing (see the sketch below)
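A quick audit of what a host actually exposes can be a few lines (the host and port list here are hypothetical; 5432 is Postgres, 5222 is the XMPP/Jabber port mentioned later):

```python
# Check which TCP ports are reachable on a host.
import socket

host = "eg-db.example.org"
for port in (22, 80, 443, 5432, 5222):   # ssh, http, https, postgres, jabber
    s = socket.socket()
    s.settimeout(1.0)
    state = "OPEN" if s.connect_ex((host, port)) == 0 else "closed/filtered"
    s.close()
    print(f"{port}: {state}")
```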
- Linux Virtual Server
- nice, but tricky to configure
- make sure application servers always route through the load-balancing servers if on the same subnet
- the defaults on Debian...
- upshot: pay attention to networking within the EG cluster
- graphical linux tools for those less familiar with CLI
- pingplotter takes a simple .ini file
- can send the directory to someone, get them to run it for a week, zip it up and send it back
- little OpenBSD boxes running PF; a little $500 box to collect everything
- sometimes you need to monitor for a month to find the real problems
- EG servers over satellite connection
- an exercise in masochism?
- difficult to solve because of the high latency; you can’t just stick a caching proxy in front of it
- try to get something better than satellite installed (apply for grants to finance improved connections; you can’t fulfill the core mission otherwise!)
- Farmers in Wales digging their own fibre trenches!
- easier if the load balancer has 2 interfaces
- you don’t want the Jabber port open to the outside either
- the load balancer also serves as the default gateway
- besides load balancing HTTP & HTTPS, you can use it to port-forward the SIP service, Z39.50 service, etc.
Teach a Man to Fish; Or, How I Stopped Worrying and Learned to Solve Problems with OpenSRF
Mike Rylander & Erica Rohlfs
Teach a man to ShareStuff: A followup to Sharing is Caring
- crazy idea to build a thing to share things that exist in EG across instances
- receive objects from an evergreen instance, and deliver them to other EG instances
- one rule to go by when writing
- Mike does not make pretty interfaces (sorry in advance)
- sharestuff is web-based
- the community is librarians, rather than patrons
- select a search filter
- subscription patterns and captions
- ordered by first rating
- second ordering is by title
- click on title, detail page
- basic metadata that surrounds any data object
- each object might be usable in multiple versions of EG
- you determine this as a provider
- description field is free text (room for UI improvement)
- view created objects
- can be private so they won’t show up in search
- captions and patterns wizard
- will be more search options added
- interface, both web interface and popup within EG
- based on template toolkit
- delivers API for every screen that will be built
- xml file everyone loves to hate (?)
- specifically allows you to work in other languages
- giving the option to work in other languages to potential developers
- stole from EG:
- slimmed it down a little bit
- for gathering little bits of data
- only used by EG right now, may become useful for other open source projects to use in general
- haven’t changed much over their lifetime
- self-contained objects
- removing or changing views affect where you’ll be able to run it
- with a bit more work to record dependency info, it should be doable
- EG version dependent (EG macros change)
- don’t want to break people’s receipt printing
- will just put it up in a git repo
- also have a standard instance
- our hope is that a single instance that everyone can join will encourage people to share stuff
- next step: getting buy-in and help from everybody
- adding SCAP integration into 2.5
- would like people to extend it into other objects
- is the idea that ShareStuff will be able to tell from looking at it what version it will work with?
- initially, it will look at the version it came from
- i don’t see why not
- they have the ability to hold info that can’t be used by other systems (custom statuses, shelving notations, etc.)
- could just discard things it doesn’t understand, or refuse to work if it finds something it doesn’t know
- EG will have object tracking
- will register the installed package and point to the actual rows in the DB that contain it
- when a row changes, it will check the registered packages table
- will mark the registered package as changed
- there will be an interface in the admin section that will allow you to list packages
- if someone changes their publication pattern, they can push out an upgrade to that package
- probably one area where we’d want to put in a lot more logic
- you need to fill in parameters that don’t make sense in the new context
- will take more work
Home Grown RFID for Evergreen
Bill Ott (Grand Rapids Public Library)
- a very much do-it-yourself shop
- considered the idea of writing their own ILS
- own notification system, with telephony
- telephone renewal system: 2,500 renewals/month
- still had this big ball and chain: magnetic strip security system
- weigh about 700 lbs
- very expensive
- use SIP
- toyed with idea of USB-powered sensitizer/desensitizer
- really wanted RFID
- local manufacturing facility
- a box of 100 CDs moving at 70 ft/sec
- but no software to do what we wanted to do!
- using UHF, not HF (kind of a holy war)
- more sensitive to metallic shielding
- different ranges
- APIs were for .NET & C#
- would require software licenses!
- liked the idea of reading tags at a distance, knowing what was going on
- hired interns!
- hired 2 recent graduates, went to work writing software
- bought development kits
- doing R&D:
- antennas out of tin cans and wire
- duct taped, readers to chutes
- a lot of fun!
- got some working software: self-check & door security software
- took a small collection & a small initial self-check
- realized pretty quickly that they didn’t want to run Windows on something that needed to run all the time
- after 6-8 weeks (300 circs): let’s run with it!
- local tag manufacturer:
- for a couple cents per tag, we can print on and encode them for you.
- just peel & stick (in theory)
- may be buying tags that would never end up in a book, but it still seemed worth it
- little bags of tags (“dimebags”)
- grab a bag or roll and scan them: this set of items is tagged
- go and tag items, and then untag missing ones
- kind of backwards, but it worked
- had to deal with items that were in circulation at the time later
- tagged every disc in a set, as well as the package
- 30-part items are a bit hit-and-miss
- check-out: search for the number of parts and alert the patron if things are missing
- max of 5 or so. alert patron to check if item has more tags than that
- software to printout lists in shelving order (not necessarily call number order)
- worked pretty well!
- a part-time technician knew C# and re-wrote the door security code
- Mono runs the C# code on Linux
- want to read everything as it comes by
- reads better when things are in motion
- if someone starts putting an item in and then changes their mind....
- need shielding! (Cobaltex?) shielded fabric
- waited a few months after starting checkout before starting checkin
- a lot of reports: more of a process issue in getting materials where they need to be
- a little flashing LED buzzer, made by hand
- originally, readers were on all the time, triggering accidentally
- now it turns on based on a detector
- at checkin, we only listen for a response from the A state
- will wait for a few seconds to minute before it responds again (good and bad)
- openSRF to perform checkout & checkin
- can check a stack of items in a second or two
- had some challenges as it’s not a serial operation
- EG expects to check things one-at-a-time
- do all the normal checks (fines, number of items, etc.)
- accept fine payments at the kiosk via web interface
- readers can read 400 tags/second
- mounted under an inch of marble: no problem at all
- passively read and record every item that passes in and out of the building
- we write to a table before it’s checked out
- did they read it and not check out for some reason?
- can kill discarded tags forever
- can lock with password, so you can’t change/erase tags
- chose not to set a checkin/checkout flag on the tag itself
- look at the part counts
- instantly see how many parts an item should have
- it has been a very interesting experience!
- hope to release the code some day
- it’s a Windows app, but you can run it under Mono
- the interns were good at delivering something that works, but the code needs cleanup
- need to remove proprietary info
- have abstracted out portions of the code, so you should be able to swap out the reader and use another one
- code is ugly, but it works
- do you have tags for discs?
- under $400,000 for entire project
- 8 libraries, 1 million tags
- door security on 8 locations
- inside and outside book drops with action triggers
- big sensors are about $800
- not all the tags all the time? how often does that happen?
- more useful for finding misplaced items
- how do you catalogue multi-part items in EG?
- part counts are all on the reader side
- one record on the EG side
Works with Evergreen: Practical Approaches to Integration
Jeff Godin (Traverse Area District Library)
- API related stuff
- part one: the now. what’s possible currently
- part two: the future.
Part 1 Practical Approaches
- Things that integrate with EG (not a comprehensive list)
- not counting things like barcode scanners
- OCLC streaming MARC records straight into your system
- all of these talk to EG via SIP except staff RFID
- library media lockers for hold pickup
- at least one example talks SIP
- Z39.50, SRU (server and client)
- Content Cafe, ChiliFresh, etc.
- EG making use of a vendor API
- online payments (PayPal, etc.)
- might batch-load from Safari or OverDrive
- EG hides place-hold functionality for purely electronic resources
- authentication (EZproxy) to log into OverDrive
- PC management talks SIP to ILS
- using SIP to authenticate patrons to wireless network
- can look at individual fields in the SIP response (see the sketch below)
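A minimal sketch of such a SIP2 exchange (host, port, credentials, and barcode are hypothetical; message codes 93/23 and the AO/AA/AD fields come from the SIP2 spec, and many servers additionally require sequence numbers and checksums, omitted here):

```python
# Raw SIP2: log in (93), then send a patron status request (23) and read the 24 reply.
import socket

s = socket.create_connection(("sip.example.org", 6001))
s.sendall(b"9300CNsipuser|COsippass|\r")          # 93 = login; expect a "941" reply
print(s.recv(1024))
# 23 = patron status: 3-char language, 18-char timestamp, then |-delimited fields
s.sendall(b"2300020130101    101010AOBR1|AA12345|AD9999|\r")
print(s.recv(1024))                               # "24..." carries the patron fields
s.close()
```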
- one vendor: Aerohive(?)
- people send telephone/email/postal notifications
- Unique Management (collections / recovery)
- Boopsie(?), Library Anywhere
- collectionHQ: analyse a lot of EG data to help you make certain decisions
- using EG to share resources
- or use NCIP responder
- can eliminate duplicate entries
- patron info all in one place
- Perl, literally typing in a CLI session
- JasperReports - ties directly into the SQL database using ddi(?)
- just writing SQL using templates
- do you have a product / tool / script?
- go to blog post (evergreen-ils.org/blog/?p=966) by Galen
- how are these things tying onto EG?
- how could EG make it easier for these things to integrate?
EG Integration Points
- APIs / standards
- SIP2, Z39.50, EDI, RSS feeds, OpenSearch
- natively over XMPP or jabber
- can use the OpenSRF native libraries - a lot of options
- not everything has that capability - needs a translator
- a series of urls that allow you to request records in different formats
- e.g. top 10 record in a MARC format
- can work, but run the risk of a lot of things breaking when you upgrade
- should be calling the appropriate openSRF method
- risk breaking integration with local vendor
OpenSRF client Libraries
OpenSRF gateway and translator
Using the gateway
- method names typically start with the service name; parameters are passed as JSON strings (a bare string needs the quotes)
- you need the XML definitions (the IDL and fieldmapper)
- can request chunks on demand, not only the whole thing at once
- can use the existing implementations or make your own for a different language (see the sketch below)
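A minimal sketch of a gateway call (the host is hypothetical; the gateway path and the org_tree method follow the usual Evergreen conventions, but verify against your own install):

```python
# Call a public OpenSRF method through the HTTP gateway and decode the JSON.
import json
import urllib.parse
import urllib.request

base = "https://evergreen.example.org/osrf-gateway-v1"
params = urllib.parse.urlencode([
    ("service", "open-ils.actor"),
    ("method", "open-ils.actor.org_tree.retrieve"),  # takes no parameters
])
with urllib.request.urlopen(f"{base}?{params}") as resp:
    reply = json.load(resp)

# Replies arrive as {"status": ..., "payload": [...]}; payload objects use the
# fieldmapper classes defined in the IDL.
print(reply["status"])
print(json.dumps(reply["payload"])[:200], "...")
```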
- avoid direct db manipulation
- don’t hard-code array positions (they can change!)
- avoid screen scraping
“let me sing you the song of my people”
New RESTful API
- can say: “EG 2.5 & 2.6 support v1 of the API,” etc.
what it’s not:
- not replacing OpenSRF
- not intended to be useful for large batches