Question

Examples of the two input CSV files are attached below. One file is the crime database and the other is the suspect database.

Specification

Your task is to write a Python program that takes three CSV file names on the command line. The first CSV file contains STR counts for the DNA found at a list of crime scenes; the second CSV file contains a list of suspects' names and their DNA sequences; the third CSV file name is the output file, to which you write a CSV that maps each suspect's name to the list of crimes whose DNA matches the suspect.
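A minimal sketch of the command-line handling, assuming the three file names are passed positionally in the order described. The function name `parse_args` and the file names in the example call are hypothetical, not part of the assignment.

```python
import sys


def parse_args(argv):
    """Return (crime_csv, suspect_csv, output_csv) from an argv-style list.

    argv[0] is the script name, so exactly four entries are expected.
    """
    if len(argv) != 4:
        raise SystemExit(f"usage: {argv[0]} CRIME_CSV SUSPECT_CSV OUTPUT_CSV")
    return argv[1], argv[2], argv[3]


# Hypothetical invocation; a real program would pass sys.argv instead.
example = parse_args(["match.py", "crimes.csv", "suspects.csv", "result.csv"])
print(example)
```

In the actual program the call would be `parse_args(sys.argv)`, and the three returned paths would be handed to the reading and writing steps.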

Your program will take three file names on the command line:

  • The first command-line argument is the file name of a crime database in CSV format. The header row looks like this:
    CrimeID,STR1,STR2,STR3,...
    where each STRi is a short DNA sequence composed of the DNA bases A/C/G/T. Each row consists of a CrimeID of the form CIDXXXXX followed by an integer count for each of the STRs.
  • The second command-line argument is the file name of a suspect DNA database in CSV format. The header row looks like this:
    Suspect,Sequence
    Each row has a suspect name and the suspect's DNA sequence.
  • The third command-line argument is the file name to which the program writes the matching results, also in CSV format. The header row looks like this:
    Suspect,Crimes
    Each row has a suspect name and all the crime IDs whose DNA matches the suspect's. The crime IDs are stored as a ','-separated string that is treated as a single CSV cell value (meaning it will be enclosed in double quotes if a suspect has more than one matching crime ID).
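The quoting rule for the output file can be sketched with Python's standard `csv` module, whose default writer quotes a field only when it contains the delimiter. The function name `write_matches` and the suspect names and crime IDs below are made up for illustration. The specification does not say what to write for a suspect with no matching crimes; this sketch would emit an empty Crimes cell.

```python
import csv
import io


def write_matches(out_file, matches):
    """Write Suspect,Crimes rows; `matches` maps suspect name -> list of crime IDs."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["Suspect", "Crimes"])
    for suspect, crime_ids in matches.items():
        # Joining with ',' puts all IDs in one cell; the writer adds the
        # surrounding double quotes only when the cell contains a comma.
        writer.writerow([suspect, ",".join(crime_ids)])


# Illustrative data written to an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_matches(buffer, {"A": ["CID00001"], "B": ["CID00002", "CID00007"]})
print(buffer.getvalue())
# Suspect,Crimes
# A,CID00001
# B,"CID00002,CID00007"
```

When writing to a real file, opening it with `open(path, "w", newline="")` lets the `csv` module control line endings.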
CrimeID,AGATC,TTTTTTCT,AATG,TCTAG,GATA,TATC,GAAA,TCTG
CID00000,15,49,38,5,14,44,14,12
CID00001,31,21,41,28,30,9,36,44
CID00002,9,13,8,26,15,25,41,39
CID00003,37,40,10,6,5,10,28,8
CID00004,37,47,10,23,5,48,28,23
CID00005,25,38,45,49,39,18,42,30
CID00006,46,49,48,29,15,5,28,40
CID00007,43,31,18,25,26,47,31,36
CID00008,46,41,38,29,15,5,48,22
CID00009,7,11,18,33,39,31,23,14
CID00010,22,33,43,12,26,18,47,41
CID00011,42,47,48,18,35,46,48,50
CID00012,9,13,33,26,45,11,36,39
CID00013,18,23,35,13,11,19,14,24
CID00014,17,49,18,7,6,18,17,30
CID00015,14,44,28,27,19,7,25,20
CID00016,29,29,40,31,45,20,40,35
CID00017,6,18,5,42,39,28,44,22
CID00018,37,47,13,25,17,6,13,35
CID00019,29,27,32,41,6,27,8,34
CID00020,31,11,28,26,35,19,33,6
CID00021,26,45,34,50,44,30,32,28
CID00022,29,50,18,23,38,24,22,9
CID00023,21,38,33,24,16,11,6,18
CID00024,13,7,13,37,8,8,20,40
CID00025,19,14,38,33,39,32,37,20
CID00026,9,32,35,25,27,29,20,37
CID00027,12,6,25,10,38,12,35,18
CID00028,40,21,8,32,35,35,13,38
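One possible way to load a crime database like the one above, sketched with the standard `csv` module. The function name `load_crimes` is hypothetical, and the two sample rows are modeled on the first rows of the listing, assuming crime IDs of the form CID followed by five digits. `skipinitialspace=True` is used so that stray spaces after commas do not end up in the STR names.

```python
import csv
import io


def load_crimes(csv_file):
    """Return (str_names, crimes), where crimes maps CrimeID -> {STR: count}."""
    reader = csv.reader(csv_file, skipinitialspace=True)
    header = next(reader)
    str_names = [name.strip() for name in header[1:]]  # skip the CrimeID column
    crimes = {}
    for row in reader:
        if not row:  # tolerate blank lines
            continue
        crime_id = row[0].strip()
        crimes[crime_id] = {
            name: int(value) for name, value in zip(str_names, row[1:])
        }
    return str_names, crimes


# In-memory stand-in for the crime database file.
sample = io.StringIO(
    "CrimeID,AGATC,TTTTTTCT,AATG,TCTAG,GATA,TATC,GAAA,TCTG\n"
    "CID00000,15,49,38,5,14,44,14,12\n"
    "CID00001,31,21,41,28,30,9,36,44\n"
)
str_names, crimes = load_crimes(sample)
print(crimes["CID00000"]["AATG"])  # 38
```

For a real file, `open(path, newline="")` is the form the `csv` documentation recommends for reading.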
Suspect,Sequence
E,
GCTAAATTTGTTCAGCCAGATGTAGGCTTACAAATCAAGCTGTCCGCTCGGCACGGC
CTACACACGTCGTGTAACTACAACAGCTAGTTAATCTGGATATCACCATGACCGAAT
CATAGATTTCGCCTTAAGGAGCTTTACCATGGCTTGGGATCCAATACTAAGGGCTCG
ACCTAGGCGAATGAGTTTCAGGTTGGCAATCAGCAACGCTCGCCATCCGGACGACGG
CTTACAGTTAGTAGCATAGTACGCGATTTTCGGGAAATGAATGAATGAATGAATGAA
TGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAAT
GAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATG
AATGAATGAATGAATGAATGAATGAATGAATGAATTGTATCTATCTATCTATCTATCT
ATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCCCGTCA
ACTCATTCACACCGCATCCTTTCCTGCCACTGTAACTAGTCGACTGGGGAACCTCAT
CATCCATACTCTCCCACATTATGCCTCCCAACCTGTTAAGCGTGGCATGCTTGGGA
TTGCATTGATGCTTCTTGGAGAGGACGCTTTCGTTTTGGAGATTACAGGGATCCAAT
TTTATCATCGGTTCGACTCCCGTAACGACTTAGCAGTAAGGGTGCTAGTTCCTGGTT
AGAATCTTAATAAATCACGTCGCTTGGAGCAAGACAAAGATCGTCGTAATGCCAAGT
GCACGACCACCTTCAGACTTGCAGGACCCGTTTTTTCTTTTTTTCTTTTTTTCTTTT
TTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTT
TTCTTTTT
ITTTT
TTTTTCTTTTTT
TTTT
TTTTTTTCTTTTTT
TCTTTTTTTCTTTTI
ITTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTT
CTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTC
TTTTTTTCTCGATAGCTATGCGGTTCAATACAATCTTAACGCAATGCAGCGATGTGG
TTTCGTACACTTAGCATAAAACCCCCCACATTAAATCGATGTACCCGCCCTCTTAGA
CGCCAATTTCAATGCCGAACCTCCGGCGGGTATCTCTGCACTAGGAGAAGTAGCACG
TCGCTGTAGCGAACTCCTATCGTGAGATAATTTGTAGAGCTGCTCTTATAATACAAT
AGCTCAGATGGATTATTCCATGGACATCCCCGTGCGTTGTTTCGAGGATGGTAGGTG
GAAATTTTGCCAGACCTCTAGTCTTAAACATGGTTGACGTTATAGGCGCTATCTCTT
GCGTCTGGAAGTGTTAATCCGTGAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAG
AAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGA
AAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAA
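The question does not define what it means for a suspect's DNA to "match" a crime. A common reading, assumed here, is that for every STR the longest run of back-to-back repeats in the suspect's sequence must equal the count recorded for that crime. Under that assumption, the comparison might be sketched as follows; `longest_run` and `matches_crime` are hypothetical names, and the short fragment is made up rather than taken from the sample data.

```python
def longest_run(sequence, unit):
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    if not unit:
        return 0
    best = 0
    for start in range(len(sequence)):
        run = 0
        # Count how many copies of `unit` follow each other from `start`.
        while sequence.startswith(unit, start + run * len(unit)):
            run += 1
        best = max(best, run)
    return best


def matches_crime(sequence, str_counts):
    """True if every STR's longest run equals the count recorded for the crime.

    ASSUMPTION: "match" means exact equality on all STRs; the assignment
    text does not spell this out.
    """
    return all(
        longest_run(sequence, unit) == count for unit, count in str_counts.items()
    )


fragment = "CCGAAAGAAAGAAATT"  # made-up fragment with three GAAA repeats
print(longest_run(fragment, "GAAA"))                      # 3
print(matches_crime(fragment, {"GAAA": 3, "TCTG": 0}))    # True
```

The scan restarts at every position rather than jumping past a finished run, which is slower but avoids missing a longer run that begins inside a shorter one.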