You have more flexibility to implement your own function names and logic in these programs

Question
  1. You have more flexibility to implement your own function names and logic in these programs.

  2. The data files you need for this assignment can be obtained from:

    Right-click and "Save Link As..."
    • HUGO_genes.txt
    • chr21_genes.txt
  3. Create an output directory inside your assignment4 directory called "OUTPUT" for result files, so that they will not mix with your programs. Output from your programs will be here!

  4. Your program must implement command line options for the infiles it must open, but for testing purposes it should run by default, so if no command line option is passed at the command line, the program will still run. This will help in the grading of your program.

  5. Create a Python module called my_io.py. You can use the same code from assignment 3's solution. Put this my_io.py module in a subdirectory named assignment4 inside your assignment4 top-level directory (see the tree below, and see Lecture 7 on how to implement a Python module and a package). Any time a file needs to be opened (for reading or writing) in your programs in this assignment, the program should call this module's function get_fh. You can then use my_io.get_fh by doing this at the top of your programs:

    from assignment4 import my_io

    # I can then use the module's get_fh() function by:
    fh_in = my_io.get_fh(infile1, "r")
    # note the function call "my_io.get_fh()"

    Or import the function directly:

    from assignment4.my_io import get_fh

    # I can then use the module's get_fh() function by:
    fh_in = get_fh(infile1, "r") # note the function call "get_fh()"


    assignment4
    ├── assignment4
    │   ├── __init__.py    # This file can be empty, and it's named __init__.py; you can create it at the terminal: touch assignment4/__init__.py
    │   └── my_io.py
    ├── HUGO_genes.txt
    ├── OUTPUT             # your output will go here (you can make this manually, i.e. your program does not need to do this)
    │   ├── categories.txt
    │   └── intersection_output.txt
    ├── categories.py
    ├── chr21_gene_names.py
    ├── chr21_genes.txt
    ├── chr21_genes_categories.txt
    ├── intersection.py
    └── tests
        ├── __init__.py
        └── unit
            ├── __init__.py
            └── test_my_io.py    # make sure to write this test, see solution to assignment 3
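The assignment 3 solution is not shown here, so the following is only a minimal sketch of what my_io.py might contain. The signature get_fh(file, mode) matches the calls above; the restriction to "r" and "w" and the choice of exceptions are assumptions of this sketch, not requirements stated in the assignment.

```python
"""my_io.py: hypothetical sketch of the file-handle helper module."""


def get_fh(file=None, mode=None):
    """Open `file` in `mode` ("r" or "w") and return the file handle.

    Assumption of this sketch: an unsupported mode raises ValueError, and a
    failure to open the file raises OSError with a more descriptive message.
    """
    if mode not in ("r", "w"):
        raise ValueError(f"mode must be 'r' or 'w', got {mode!r}")
    try:
        return open(file, mode)
    except OSError as err:
        raise OSError(f"Could not open {file!r} with mode {mode!r}: {err}") from err
```

A test in tests/unit/test_my_io.py could then write a small temporary file with get_fh(path, "w"), read it back with get_fh(path, "r"), and check that a bad mode or a missing file raises the expected exception.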

Information on Source files

  • The chr21_genes.txt file lists genes from human chromosome 21, in their order along the chromosome, as described in Hattori et al. (Nature 405, 311-319). For each gene, the file gives the gene symbol, description, and category. The fields are separated by tabs. You will need to get the meaning of each category. You can find these meanings in the original paper, under the "Gene categories" section. Create a file named chr21_genes_categories.txt that stores this information in tab-separated fields:

    1.1	Genes with 100% identity over a complete cDNA with defined functional association (for example, transcription factor, kinase).
    1.2	Genes with 100% identity over a complete cDNA corresponding to a gene of unknown function (for example, some of the KIAA series of large cDNAs).
    ..
    ..
    (you will fill in the rest)

    This will be used in program #2.

  • The HUGO_genes.txt file lists all human genes that have an official symbol approved by the HUGO Gene Nomenclature Committee (some have probably changed by now). For each gene, the file gives its symbol and description, separated by a TAB character.
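Both source files and the categories file are tab-separated, so each line can be split into fields with str.split("\t"). The sketch below is one possible way to do that; the function name iter_tab_records and the decision to skip lines with an unexpected number of fields are assumptions made for illustration. The sample lines reuse the two category definitions quoted above.

```python
from typing import Iterable, Iterator, List


def iter_tab_records(lines: Iterable[str], n_fields: int) -> Iterator[List[str]]:
    """Yield the tab-separated fields of each non-empty line.

    Assumption of this sketch: lines that do not have exactly `n_fields`
    fields are silently skipped rather than reported as errors.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) == n_fields:
            yield fields


# Two lines in the format of chr21_genes_categories.txt (category, meaning).
sample_lines = [
    "1.1\tGenes with 100% identity over a complete cDNA with defined "
    "functional association (for example, transcription factor, kinase).\n",
    "1.2\tGenes with 100% identity over a complete cDNA corresponding to a "
    "gene of unknown function (for example, some of the KIAA series of large cDNAs).\n",
]

# Map each category code to its meaning.
category_meaning = {code: meaning for code, meaning in iter_tab_records(sample_lines, 2)}
```

Because an open file handle is itself an iterable of lines, the same function could be given the handle returned by my_io.get_fh instead of a list.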

Exercises

You must solve exercises 1 and 2 by using Dictionaries, and exercise 3 by using Lists or Sets.
  1. Write a program (call it chr21_gene_names.py) that asks the user to enter a gene symbol and then prints the description for that gene based on data from the chr21_genes.txt file. The program should give an error message if the entered symbol is not found in the table. The user should not have to worry about case, i.e. the search must be case-insensitive. The program should continue to ask the user for genes until "quit" or "exit" is given (case-insensitive). Make sure the prompt tells the user to enter quit to end the program. Use Dictionaries to solve this problem. HINT: Feel free to use a Dictionary of Dictionaries, but it is not required.

    HINT: First read the entire text file into a Dictionary that maps the association between gene symbol and description. Once again, make sure to use a Dictionary.

    Remember to have these command line options:

    $ python3 chr21_gene_names.py -h
    usage: chr21_gene_names.py [-h] -i INFILE

    Open chr21_genes.txt, and ask user for a gene name

    optional arguments:
      -h, --help            show this help message and exit
      -i INFILE, --infile INFILE
                            Path to the file to open

    $ python3 chr21_gene_names.py -i chr21_genes.txt

    Output from this program should just go to <STDOUT>:

    Enter gene name of interest. Type quit to exit: TPTE
    TPTE found! Here is the description:
    tensin, putative protein-tyrosine phosphatase, EC 3.1.3.48.
    Enter gene name of interest. Type quit to exit: TPTTTT
    Not a valid gene name.
    Enter gene name of interest. Type quit to exit: qUiT
    Thanks for querying the data.
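The pieces of exercise 1 (a default input file, a Dictionary keyed by gene symbol, and a case-insensitive query loop) could fit together roughly as sketched below. This is not the official solution: the function names get_cli_args, build_gene_dict, and query_loop are hypothetical, the default path "chr21_genes.txt" is an assumption, and plain open() stands in for my_io.get_fh only to keep the block self-contained.

```python
import argparse


def get_cli_args(argv=None):
    """Parse the command line; -i falls back to a default so the program runs without options."""
    parser = argparse.ArgumentParser(
        description="Open chr21_genes.txt, and ask user for a gene name")
    parser.add_argument("-i", "--infile", dest="infile", type=str,
                        default="chr21_genes.txt",  # assumed default path
                        help="Path to the file to open")
    return parser.parse_args(argv)


def build_gene_dict(lines):
    """Map upper-cased gene symbol -> description from tab-separated lines.

    If the file starts with a header row, it simply becomes one more entry
    here; a real solution might skip it explicitly.
    """
    genes = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= 2:
            genes[fields[0].upper()] = fields[1]
    return genes


def query_loop(genes, ask=input, show=print):
    """Prompt until the user types quit or exit (any case)."""
    prompt = "Enter gene name of interest. Type quit to exit: "
    while True:
        symbol = ask(prompt).strip()
        if symbol.lower() in ("quit", "exit"):
            show("Thanks for querying the data.")
            break
        description = genes.get(symbol.upper())  # case-insensitive lookup
        if description is None:
            show("Not a valid gene name.")
        else:
            show(f"{symbol} found! Here is the description:")
            show(description)


def main():
    args = get_cli_args()
    # The assignment requires my_io.get_fh(args.infile, "r") here instead of open().
    with open(args.infile, "r") as fh_in:
        genes = build_gene_dict(fh_in)
    query_loop(genes)
```

In the real script, main() would be called under an `if __name__ == "__main__":` guard. Passing `ask` and `show` as parameters is a design choice of this sketch: it lets a unit test feed scripted answers to the loop without touching the keyboard or the terminal.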