Beta Pair Bulk Observations - Part 2
The Pair Bulk Data people have changed the API, so I thought I'd start over. The
original file is here
- The titles come back in all uppercase in the downloaded data. In the preview pane they're all uppercase too, but if I copy one and paste it, it's in mixed case like patft shows (copying what's displayed in the client on pairbulkdata.uspto.gov no longer does this?). The markup seems to be displaying the title in uppercase. CoNfUsEd Am I.
- I selected plant as the apptype and patented case as type and then downloaded the dataset. 2015.json in the zip file only has two plant patents which is a few thousand short. I then cleared the selections and entered a plant patent number sequentially after one in the json. Data was returned online but that same patent was not in the downloaded data.
- Page showing the patent data that comes back for each query. The other sample pages show only a subset of the data; to make a spreadsheet analogy, they display only some of the available columns. This page is more of a full-disclosure thing. If the kitchen sink is being returned, this page will show it.
- I updated my App Inventor Android app so it's working again!
- The data still isn't complete when compared to patft. I tried to search for all plant patents (class of plt). The results I get back say there were 11 plant patents in 1979, while patft says 176 (from ccl/plt/$ and isd/1979$). Other years are a few short, e.g. 1984: 205 vs. patft's 212. Also, the current year is lagging behind patft, which updates each Tuesday. On Sept 28, 2016 patft has 938 plant patents for 2016 while Pair Bulk Data has 165.
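To put those gaps in proportion, here is a small arithmetic sketch that uses only the counts quoted in the item above. The `coverage` helper and the dictionary layout are just illustrative names of mine, not anything from the Pair Bulk Data API.

```python
# Counts quoted above, as (Pair Bulk Data, patft) pairs per issue year.
counts = {
    "1979": (11, 176),
    "1984": (205, 212),
    "2016 (as of Sept 28)": (165, 938),
}


def coverage(bulk: int, patft: int) -> float:
    """Share of patft's count that Pair Bulk Data returned, in percent."""
    return 100.0 * bulk / patft


for year, (bulk, patft) in counts.items():
    missing = patft - bulk
    print(f"{year}: {bulk}/{patft} = {coverage(bulk, patft):.1f}% (missing {missing})")
```

Seen this way, 1984 is nearly complete, while 1979 and the then-current year are missing most of their plant patents.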
- It looks like you can have a secondary sort! ex:
"sort":"patentIssueDate desc, patentNumber desc" but I didn't see it
documented anywhere. Syntactically I think it should be
"sort":["patentIssueDate desc", "patentNumber desc"] but that threw an error.
On notitle.htm a sort of "patentTitle asc, patentIssueDate asc" seems to partially work but "patentTitle asc, patentIssueDate desc" doesn't do what I want it to do.
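A minimal sketch of building that comma-separated sort string, based only on the form that worked for me above. The `build_sort` helper is my own hypothetical name, and the rest of the request body is deliberately left out because I don't want to guess at it.

```python
import json


def build_sort(*keys):
    """Join (field, direction) pairs into the single-string form that worked.

    The list form ("sort": [...]) threw an error for me, so this always
    returns one comma-separated string.
    """
    parts = []
    for field, direction in keys:
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be 'asc' or 'desc', got {direction!r}")
        parts.append(f"{field} {direction}")
    return ", ".join(parts)


# Only the "sort" key is shown; any other request fields are omitted here.
query = {"sort": build_sort(("patentIssueDate", "desc"), ("patentNumber", "desc"))}
print(json.dumps(query))
```

Whether the secondary key is always honored is a separate question, as the notitle.htm experiment suggests.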
- Here's a comparison with patentsview's api
- ccl's (appClsSubCls) like PLT/263.1 come back as PLT/263.100 as shown here. A search for PLT/263.1 returns no results; results are returned for PLT/263.100.
- Leading zeroes are added to the subclass if it has fewer than three digits.
ex: PLT/65 comes back as PLT/065. Like the previous item, a search for
PLT/65 returns no results while PLT/065 returns results. To combine this with the previous item, there are subclasses where both things apply. Ex: 435/069.400
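Putting the last two items together, here is a rough sketch of turning a patft-style ccl into the padded form that Pair Bulk Data seems to want. It covers only the patterns I observed above (numeric subclass, optional decimal part); `normalize_ccl` is a made-up helper name, and other subclass shapes are passed along with whatever padding falls out.

```python
def normalize_ccl(ccl: str) -> str:
    """Pad a class/subclass string the way Pair Bulk Data appears to store it.

    Assumed rules, inferred from the examples above:
      - the part before the decimal point is left-padded with zeroes to 3 digits
      - the part after the decimal point, if present, is right-padded to 3 digits
    """
    cls, slash, sub = ccl.partition("/")
    if not slash:
        return ccl  # no subclass given, nothing to pad
    whole, dot, frac = sub.partition(".")
    whole = whole.zfill(3)
    if dot:
        return f"{cls}/{whole}.{frac.ljust(3, '0')}"
    return f"{cls}/{whole}"


for example in ("PLT/263.1", "PLT/65", "435/69.4"):
    print(example, "->", normalize_ccl(example))
```

Already-padded values such as PLT/263.100 pass through unchanged, so the function can be applied to search input without checking first.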
- It is True that I am about as boolean fluent as a person can be, but I must confess that some of the statements in the Pair Bulk Data FAQ have me stymied. I'll buy that
  - three plug monomers porous is the same as: three AND plug AND monomers AND porous
  but I cannot wrap my mind around any of these assertions:
  - apple OR banana three plug monomers porous NOT polymer is the same as: (three OR plug OR monomers OR porous) NOT polymer
  - +apple +banana three plug monomers porous -polymer is the same as: (three AND plug AND monomers AND porous) NOT polymer
  - apple - "banana cake" three plug monomers porous -polymer is the same as: (three OR plug OR monomers OR porous) NOT polymer
  - sensor AND (cat OR dog) /n three AND plug AND monomers AND porous NOT polymer is the same as: three AND plug AND monomers AND (porous NOT polymer)" ([sic]'s on the stray /n and lone double quote; not my typos, merely pointing them out)
  I'd like to know where the apple and banana [cake] went in the first three and where cat or dog went in the fourth (they are missing from the right sides of the colons). Maybe it's something like this:
  "How come when you mix water and flour together you get glue...and then you add eggs and sugar and you get cake? Where does the glue go?" -Rita Rudner
  Do me a favor and bet $2 to show if you are at a race track and a horse named Cat or Dog is running in the fourth...
- Suggestion: provide the Swagger 3/OpenAPI version of the api's swagger object. I used https://lucybot-inc.github.io/api-spec-converter/ to convert the pairbulk json object from swagger 2 to Swagger 3/OpenAPI and put the results here for people to try. The 2.0 version is available from this page on their web site. It would be nice to provide both versions of the json objects. Tools exist to do different things with various versions of swagger objects. Having both available maximizes flexibility.
- Cooperative Patent Classifications (cpcs) are not being returned for patents.
- Patents can have multiple uspcs (appClsSubCls) but Pair Bulk only returns one per patent. Here's an example where only one of four uspcs is returned and the patent's cpc is not returned.
- It looks like data for patents yet to be issued is available.
The patent number and issue date are not filled in but everything else is.
Check out the Future Patents page and
Patent Data page. The presence of the
future data initially confused me, making me think that my searches
were wrong as explained here.
One last way to see this is
- There is a patent with number 9,999,905 whose title is "Testing", as shown here. As of 2017-11-28 utility patents are up to 9,832,920. With around 6,000 utility patents issued per week, the collision will occur in 27 weeks!
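The back-of-the-envelope arithmetic behind that collision estimate, as a sketch. All three numbers come from the item above; the 6,000 per week figure is my own rough rate, so the projected date is only as good as that assumption.

```python
from datetime import date, timedelta

test_patent = 9_999_905    # the "Testing" record's patent number
latest_issued = 9_832_920  # highest utility patent number as of 2017-11-28
per_week = 6_000           # rough weekly issue rate (an approximation)
as_of = date(2017, 11, 28)

remaining = test_patent - latest_issued  # numbers left before the collision
weeks = remaining / per_week             # fractional weeks at the assumed rate
estimate = as_of + timedelta(weeks=weeks)

print(f"{remaining:,} numbers to go, about {weeks:.1f} weeks")
print(f"projected collision around {estimate.isoformat()}")
```

The quotient is a bit under 28, so "27 weeks" is the number of full weeks remaining.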
- I picked an app status of patented case at
Several patents have N/A titles
- There are 35 patents with a
US classification of XXX/XXX.XXX
which does not exist.
- Similarly there are 347
with a nonexistent US classification
- A search for a subclass
of 999.999 in any class says there are 70,973 results.
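For anyone filtering the downloaded data, here is a hedged sketch of flagging the placeholder-looking records described in the last few items. The field names patentTitle and appClsSubCls are the ones used elsewhere on this page, but treating each record as a flat dict is my assumption about the JSON, and the sample records below are invented purely for illustration.

```python
import re

# Matches the two placeholder classifications noted above:
# the literal XXX/XXX.XXX and a subclass of 999.999 in any class.
PLACEHOLDER_CCL = re.compile(r"^(XXX/XXX\.XXX|[^/]+/999\.999)$")


def looks_like_placeholder(record: dict) -> bool:
    """True if the record has an N/A title or a placeholder classification."""
    title = (record.get("patentTitle") or "").strip().upper()
    ccl = (record.get("appClsSubCls") or "").strip().upper()
    return title == "N/A" or bool(PLACEHOLDER_CCL.match(ccl))


# Invented sample records, not real data.
samples = [
    {"patentTitle": "N/A", "appClsSubCls": "PLT/263.100"},
    {"patentTitle": "ROSE PLANT", "appClsSubCls": "XXX/XXX.XXX"},
    {"patentTitle": "ROSE PLANT", "appClsSubCls": "435/999.999"},
    {"patentTitle": "ROSE PLANT", "appClsSubCls": "PLT/065"},
]
flagged = [r for r in samples if looks_like_placeholder(r)]
print(len(flagged), "of", len(samples), "sample records flagged")
```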
- There's a patent with an
application type "N/a". There's also a problem with an application type of provisional. The title comes back in the patent number field and the patent number str field gets commas added.