The Fraudulence of Most Patent Citation Analysis
Patent Citation Analysis
The most popular method of "objective" patent analysis is patent citation analysis: forming theories about information flows between inventors based on how they cite each other's patents, or forming theories about how important patents are (especially for valuation purposes) based on how often they are cited. The problem is that most of these studies are nonsense, because the economists analyzing patent citations make many false assumptions about how patent citations are created. For the most part, patent citations measure nothing about the relationship between two patents.
Statistical unreliability of citation analysis
To appreciate this nonsense, one can turn to two of the "deans" of citation analysis: Adam Jaffe, economics professor at Brandeis University, and Manuel Trajtenberg, economics professor at Tel Aviv University. In 2002, MIT Press published a 460-page book by Jaffe and Trajtenberg titled "Patents, Citations and Innovations", a 'classic' in the field.
On page 27, they quote not their own analysis but a 1976 US OTA report to explain the importance of patent citations:
"During the examination process, the examiner searches the pertinent portion of the 'classified' patent file. His purpose is to identify any prior disclosures of technology ... which might anticipate the claimed invention and limit the scope of patent protection ... If such documents are found ... [they] are 'cited' in any patent which matures from the application. Thus, the number of times a patent document is cited may be a measure of its technological significance."
To the contrary, the number of times a patent document is cited is a measure of little significance. First, it is well known that a large percentage of issued patents are invalid in light of prior patents the examiner failed to find: overlooked prior art that is more "technologically significant" to the patent in question than the art actually cited. The existence of these uncited patents makes the cited patents of much less statistical significance.
Additionally, information flows between inventors and companies cannot be inferred from these citations where a) the examiner finds the patents to be cited, b) a professional searcher finds the patents to be cited, or c) the inventor or his/her lawyer finds the patents during patent preparation (and thus after the information flows of the inventive process). Patent citations also have to be corrected by removing erroneous citations, duplicate citations, and citations that are not citations (e.g., court records relating to the patent).
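To make the size of the problem concrete, here is a minimal sketch of the clean-up described above, written in Python. Everything in it is an assumption made for illustration: the Citation record, its fields, and the labels "examiner", "searcher", "drafting" and "known_before_invention" are hypothetical, and published patent data generally does not record who found a citation or when. That missing information is exactly the difficulty.

```python
from dataclasses import dataclass

# Hypothetical label for the only case in which a citation could reflect
# an information flow: the inventor knew the reference before inventing.
FLOW_SOURCE = "known_before_invention"


@dataclass(frozen=True)
class Citation:
    citing: str                # identifier of the citing patent
    cited: str                 # identifier of the cited document
    kind: str = "patent"       # "patent", or a non-citation such as "court_record"
    source: str = "unknown"    # who found it: "examiner", "searcher", "drafting", ...
    erroneous: bool = False    # flagged as a clerical or other error


def clean(citations):
    """Drop non-citations, erroneous citations, and duplicates."""
    seen = set()
    kept = []
    for c in citations:
        if c.kind != "patent" or c.erroneous:
            continue
        key = (c.citing, c.cited)
        if key in seen:
            continue
        seen.add(key)
        kept.append(c)
    return kept


def may_reflect_information_flow(citations):
    """Keep only cleaned citations the inventor knew of before inventing.

    Citations found by the examiner, by a searcher, or during drafting,
    and citations of unknown origin, are all excluded.
    """
    return [c for c in clean(citations) if c.source == FLOW_SOURCE]


# Made-up records for one hypothetical patent "P3".
records = [
    Citation("P3", "P1", source="examiner"),
    Citation("P3", "P1", source="examiner"),            # duplicate
    Citation("P3", "P2", source=FLOW_SOURCE),
    Citation("P3", "C9", kind="court_record"),          # not a citation
    Citation("P3", "P7", source="drafting", erroneous=True),
    Citation("P3", "P5", source="searcher"),
    Citation("P3", "P6"),                               # origin unknown
]

print(len(records), len(clean(records)), len(may_reflect_information_flow(records)))
```

In this made-up example, seven raw records shrink to four after clean-up, and only one of those could even in principle say something about an information flow. With real data the source label is usually unavailable, so the last filter cannot be applied at all.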
Further, an inventor can be inspired by a journal article to make an invention (making the journal article of some technological relevance), yet the patent application may cite only less relevant prior patents, which makes the cited patents of no relevance. Further, the majority of issued patents are invalid in light of (non)patent prior art, and thus of no innovation relevance, so citation analysis that includes such patents is further statistically flawed.
Yet without knowing which patent citations are affected by these defects, one cannot draw any conclusion about technological relevance using citation analysis - it's GIGO.
Yet Jaffe and Trajtenberg make no acknowledgement of these flaws, rendering their use of patent citations (and others' use, such as that of Bronwyn Hall, an economics professor at UC Berkeley) mostly statistically meaningless. Sadly, the Acknowledgment section of their book doesn't acknowledge the help of anyone with expertise in the process of prior art searching.
What follows is a short list of papers that expose many of the statistical defects of patent citation analysis, followed by a much longer list of flawed papers on patent citation analysis. If someone mentions 'patent citation' to you - watch out, they mostly want to steal your money.