Response from the German RKI on Virus Sequencing
A pending lawsuit forces the organization to reply to my uncomfortable questions!
I submitted a Freedom of Information Request to the German CDC equivalent, the Robert Koch Institute (RKI), with four questions regarding the sequencing of SARS-CoV-2 and other viruses. This request was similar to one I previously sent to the CDC, where they confirmed that sequencing was not conducted solely on genetic viral material. Additionally, they acknowledged that other methods for confirming the sequence had not been performed. Read the full report here.
Here’s the answer from Germany (translated from German):
After I filed a lawsuit against the RKI over its failure to respond to my Freedom of Information request, the RKI provided a response 139 days late. In that response, the RKI rejected my request entirely, claiming either that the requested information does not exist or that all relevant information is already publicly available. This approach appears aimed at shifting at least part of the legal costs onto me, even though the response was issued only after the lawsuit was filed.
Question about the isolation of viral genetic material before sequencing
My question whether the RKI had ever sequenced a virus that was physically isolated from other genetic material prior to sequencing was simply sidestepped. The RKI stated that such information is publicly accessible. However, I could find no public publication that describes this. Thus, the RKI indirectly confirms that no sequencing of purified, exclusively virus-originating genetic material has ever been performed.
Examination of the reference genome and related genomes
In response to my question about whether the RKI had internally examined the SARS-CoV-2 genome and its close relative RaTG13, the RKI stated that no such investigations had been conducted at the institute, and therefore no records exist. This indicates that the RKI adopted the reference genome published by Wu et al. (2020) without conducting further verifications. The reference genome was originally derived from a single patient and constructed without additional scientific controls, such as sequencing PCR negative patients with similar etiology from the same hospital.
It remains unclear whether the RKI refers to these specific lineages or generally rules out having examined SARS-CoV-2.
According to the RKI, the protocol used does not allow sequencing of the virus’s terminal regions, although this is deemed irrelevant for lineage determination. While this may be true for lineage determination, it is relevant for the initial validation of the sequence (i.e., origin and clinical relevance).
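To make the point about terminal regions concrete, here is a minimal sketch of how the uncovered genome ends follow from an amplicon layout. The amplicon coordinates are hypothetical and are not taken from the RKI protocol; only the genome length of 29,903 nt corresponds to the published reference (NC_045512.2).

```python
GENOME_LENGTH = 29_903  # length of the published reference genome (NC_045512.2)

# Hypothetical amplicon inserts as (start, end), 1-based and inclusive.
# These numbers are illustrative only and do not come from the RKI protocol.
AMPLICONS = [(55, 385), (343, 705), (29_500, 29_830)]


def terminal_gaps(amplicons, genome_length):
    """Return the number of bases at the 5' and 3' ends covered by no amplicon."""
    first_start = min(start for start, _ in amplicons)
    last_end = max(end for _, end in amplicons)
    return first_start - 1, genome_length - last_end


if __name__ == "__main__":
    five_prime, three_prime = terminal_gaps(AMPLICONS, GENOME_LENGTH)
    print(f"uncovered at 5' end: {five_prime} nt")
    print(f"uncovered at 3' end: {three_prime} nt")
```

With these assumed coordinates, the first 54 and the last 73 bases lie outside every amplicon, so they would have to be confirmed by a separate method.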
A search for de novo assemblers (e.g., SPAdes, MEGAHIT, Trinity) in the official RKI pipeline repository (https://github.com/rki-mf1/CoVpipe2) yielded no results. Instead, only the “mapper” or aligner bwa mem was found. The diagram in the README confirms that only mapping for variant determination is performed, not de novo assembly of raw data.
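A search of this kind can be reproduced on a local clone of the repository. The following is a rough sketch, not the exact procedure I used: the directory name `CoVpipe2` and the helper `find_tool_mentions` are assumptions for illustration, and a plain substring match can produce false positives (for example, "trinity" inside an unrelated word).

```python
import os

# Tool names taken from the text above; matching is a case-insensitive substring test.
ASSEMBLERS = ("spades", "megahit", "trinity")
MAPPERS = ("bwa mem",)


def find_tool_mentions(repo_dir, tool_names):
    """Map each tool name to the sorted list of files in repo_dir that mention it."""
    hits = {name: [] for name in tool_names}
    for root, dirs, files in os.walk(repo_dir):
        dirs[:] = [d for d in dirs if d != ".git"]  # skip git internals
        for filename in files:
            path = os.path.join(root, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read().lower()
            except OSError:
                continue  # unreadable file, skip it
            for name in tool_names:
                if name.lower() in text:
                    hits[name].append(os.path.relpath(path, repo_dir))
    return {name: sorted(paths) for name, paths in hits.items()}


if __name__ == "__main__":
    # Assumes the repository was cloned into ./CoVpipe2 beforehand.
    for tool, paths in find_tool_mentions("CoVpipe2", ASSEMBLERS + MAPPERS).items():
        print(tool, "->", paths if paths else "no mention found")
```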
RNA extraction in RKI studies yields total RNA, which includes host RNA and RNA from other organisms:
All three studies exclusively use alignment methods (mapping) to reconstruct the SARS-CoV-2 genome, instead of performing de novo assembly.
Lack of de novo assembly
The RKI has no samples or datasets where the reference genome could be perfectly reconstructed using a de novo assembler (e.g., MEGAHIT).
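What such a check could look like is sketched below. The criterion is my assumption: a reconstruction counts as "perfect" if at least one assembled contig contains the complete reference sequence, in either orientation. The file names `contigs.fa` and `reference.fa` are hypothetical placeholders.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(path):
    """Read a FASTA file into a dict of header -> sequence."""
    sequences = {}
    header = None
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:]
                sequences[header] = []
            elif header is not None:
                sequences[header].append(line)
    return {name: "".join(parts) for name, parts in sequences.items()}


def reconstructs_reference(contigs, reference):
    """True if any contig contains the full reference, forward or reverse complement."""
    forward = reference.upper()
    reverse = reverse_complement(forward)
    return any(forward in c.upper() or reverse in c.upper() for c in contigs)


if __name__ == "__main__":
    # Hypothetical file names; replace with the assembler output and the reference.
    contigs = read_fasta("contigs.fa").values()
    reference = next(iter(read_fasta("reference.fa").values()))
    print("reference fully reconstructed:", reconstructs_reference(contigs, reference))
```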
Absence of the RACE methodology
The RACE (Rapid Amplification of cDNA Ends) method, which allows for the amplification and validation of genome ends, was not performed by the RKI.
Conclusion
The RKI indirectly confirms that no sequencing of genetic material exclusively of viral origin was conducted. As a result, the origin and validity of the sequence cannot be verified with certainty. Moreover, the RKI omitted critical validation steps required to ensure the structural integrity of the sequence as an independent single-stranded RNA genome of the SARS-CoV-2 virus.
Opinion
The methodological ambiguities raise questions about the origin of the genetic material as well as the clinical significance and validity of the identified sequences. It remains unclear whether these sequences are truly specific to the presumed pathogen or whether they may partially originate from other material present in the host or the environment. Such gaps could fuel speculation about the objectives of the RKI’s genomic activities.
" The reference genome was originally derived from a single patient and constructed without additional scientific controls, such as sequencing PCR negative patients with similar etiology from the same hospital."
How could anyone sequence "PCR negative patients"? If they are "negative", then there is nothing to sequence.
Would it be better to write: take equivalent samples from the "PCR negative" patients and subject these samples to sequencing in parallel with samples from "PCR positive" patients?
I don't understand.