Beta Pair Bulk Observations - Part 2

The Pair bulk data people have changed the api so I thought I'd start over. The original file is here.
  1. The titles come back in all uppercase in the downloaded data. In the preview pane they are all uppercase too, but if I copy one and paste it, it comes out in mixed case like patft shows (copying what's displayed in the client no longer seems to do this?). The markup seems to be displaying the title in uppercase. CoNfUsEd Am I.
  2. I selected plant as the apptype and patented case as type and then downloaded the dataset. 2015.json in the zip file only has two plant patents which is a few thousand short. I then cleared the selections and entered a plant patent number sequentially after one in the json. Data was returned online but that same patent was not in the downloaded data.
  3. Page showing the patent data that comes back for each query. The other sample pages show only a subset of the data. They display only some of the available columns to make a spreadsheet analogy. This page is more of a full disclosure thing. If the kitchen sink is being returned this page will show it.
  4. I updated my App Inventor Android app so it's working again!
  5. The data still isn't complete when compared to patft. I tried to search for all plant patents (class of plt). The results I get back say there were 11 plant patents in 1979 while patft says 176 (from ccl/plt/$ and isd/1979$). Other years are a few short, ex: 1984 has 205 vs patft's 212. Also the current year is lagging behind patft. Patft updates each Tuesday. On Sept 28, 2016 patft has 938 plant patents for 2016 while pair bulk data has 165.
  6. It looks like you can have a secondary sort! ex: "sort":"patentIssueDate desc, patentNumber desc" but I didn't see it documented anywhere. Syntactically I think it should be "sort":["patentIssueDate desc", "patentNumber desc"] but that threw an error. On notitle.htm a sort of "patentTitle asc, patentIssueDate asc" seems to partially work but "patentTitle asc, patentIssueDate desc" doesn't do what I want it to do.
  7. Here's a comparison with patentsview's api
  8. ccl's (appClsSubCls) like PLT/263.1 come back as PLT/263.100 as shown here. A search for PLT/263.1 returns no results, while a search for PLT/263.100 does return results.
  9. Leading zeroes are added to the subclass if it has fewer than three digits. ex: PLT/65 comes back as PLT/065. Like the previous item, a search for PLT/65 returns no results while PLT/065 returns results. To combine this with the previous item, there are subclasses where both things apply. Ex: 435/069.400
  10. It is True that I am about as boolean fluent as a person can be but I must confess that some of the statements in the Pair Bulk Data FAQ have me stymied. I'll buy that
    • three plug monomers porous is the same as: three AND plug AND monomers AND porous
    but I cannot wrap my mind around any of these assertions:
    • apple OR banana three plug monomers porous NOT polymer is the same as: (three OR plug OR monomers OR porous) NOT polymer
    • +apple +banana three plug monomers porous -polymer is the same as: (three AND plug AND monomers AND porous) NOT polymer
    • apple - "banana cake" three plug monomers porous -polymer is the same as: (three OR plug OR monomers OR porous) NOT polymer
    • sensor AND (cat OR dog) /n three AND plug AND monomers AND porous NOT polymer is the same as: three AND plug AND monomers AND (porous NOT polymer)" [sic]'s on the stray /n and lone double quote. (not my typos, merely pointing them out)
    I'd like to know where the apple and banana [cake] went in the first three and where cat or dog went in the fourth (they are missing from the right sides of the colons). Maybe it's something like this: "How come when you mix water and flour together you get glue...and then you add eggs and sugar and you get cake? Where does the glue go?" -Rita Rudner
    Do me a favor and bet $2 to show if you are at a race track and a horse named Cat or Dog is running in the fourth...
  11. Suggestion: provide the Swagger 3/OpenAPI version of the api's swagger object. I used to convert the pairbulk json object from swagger 2 to Swagger 3/OpenAPI and put the results here for people to try. The 2.0 version is available from this page on their web site. It would be nice to provide both versions of the json objects. Tools exist to do different things with various versions of swagger objects. Having both available maximizes flexibility.
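
Since items 8 and 9 above mean a search only works on the padded form of a subclass, here is a minimal sketch of a helper that turns the form patft shows into the form Pair Bulk seems to want. The function name pad_subclass is mine (hypothetical), and the padding rule is only inferred from the three examples above (PLT/263.100, PLT/065, 435/069.400). It is not taken from any Pair Bulk documentation, and it assumes purely numeric subclasses.

```python
def pad_subclass(ccl: str) -> str:
    """Pad a class/subclass string the way Pair Bulk appears to return it.

    Assumption (inferred from observed results only): the part of the
    subclass before the decimal point is left-padded with zeroes to three
    digits, and the part after it, if present, is right-padded with
    zeroes to three digits.
    """
    cls, _, sub = ccl.partition("/")       # "PLT/263.1" -> "PLT", "263.1"
    whole, _, frac = sub.partition(".")    # "263.1" -> "263", "1"
    whole = whole.zfill(3)                 # "65" -> "065"
    if frac:
        return f"{cls}/{whole}.{frac.ljust(3, '0')}"  # "1" -> "100"
    return f"{cls}/{whole}"


if __name__ == "__main__":
    for ccl in ("PLT/263.1", "PLT/65", "435/69.4"):
        print(ccl, "->", pad_subclass(ccl))
```

Running a ccl through something like this before putting it in the appClsSubCls search should avoid the empty result sets described in items 8 and 9, assuming the padding rule holds for other classes too.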


  1. Cooperative Patent Classifications (cpcs) are not being returned for patents.
  2. Patents can have multiple uspcs (appClsSubCls) but Pair Bulk only returns one per patent. Here's an example where only one of four uspcs is returned and the patent's cpc is not returned.
  3. It looks like data for patents yet to be issued is available. The patent number and issue date are not filled in but everything else is. Check out the Future Patents page and Patent Data page. The presence of the future data initially confused me, making me think that my searches were wrong as explained here. One last way to see this is here.
  4. There is a patent with number 9,999,905 whose title is Testing as shown here. As of 2017-11-28 utility patents are up to 9,832,920. With around 6,000 utility patents issued per week the collision will occur in 27 weeks!
  5. I picked an app status of patented case on the "Several patents have N/A titles" page.
  6. There are 35 patents with a US classification of XXX/XXX.XXX which does not exist.
  7. Similarly, there are 347 plants with a nonexistent US classification of PLT/999.999.
  8. A search for a subclass of 999.999 in any class says there are 70,973 results.
  9. There's a patent with an application type "N/a". There's also a problem with an application type of provisional. The title comes back in the patent number field and the patent number str field gets commas added.
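
The collision estimate in item 4 above is just a subtraction and a division, so here is the arithmetic as a small sketch. The two patent numbers and the 6,000-per-week rate come from item 4; the function name weeks_until_collision is mine (hypothetical), and the weekly rate is only a rough figure, so treat the result as a ballpark.

```python
def weeks_until_collision(target: int, current: int, per_week: int) -> float:
    """Estimate how many weeks until patent numbering reaches `target`.

    Assumes a constant issue rate of `per_week` patents, which is only
    a rough approximation.
    """
    return (target - current) / per_week


if __name__ == "__main__":
    testing_patent = 9_999_905   # the patent titled "Testing"
    latest_utility = 9_832_920   # highest utility patent as of 2017-11-28
    rate = 6_000                 # approximate utility patents per week

    print("gap:", testing_patent - latest_utility)  # gap: 166985
    weeks = weeks_until_collision(testing_patent, latest_utility, rate)
    print("weeks:", round(weeks, 1))                # weeks: 27.8
```

So the gap is 166,985 patent numbers, which at that rate is a bit under 28 weeks.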