[Fixed]-Populating a SQLite3 database from a .txt file with Python


So assuming your models.py looks something like this:

class Representative(models.Model):
    parliament = models.CharField(max_length=128)
    name = models.CharField(max_length=128)
    country = models.CharField(max_length=128)
    party_group = models.CharField(max_length=128)
    national_party = models.CharField(max_length=128)
    position = models.CharField(max_length=128)

You can then run python manage.py shell and execute the following:

import csv
from your_app.models import Representative
# If you're using different field names, change this list accordingly.
# The order must also match the column order in the CSV file.
fields = ['parliament', 'name', 'country', 'party_group', 'national_party', 'position']
for row in csv.reader(open('your_file.csv')):
    Representative.objects.create(**dict(zip(fields, row)))

And you’re done.

Addendum (edit)

Per Thomas’s request, here’s an explanation of what **dict(zip(fields,row)) does:

So initially, fields contains a list of field names that we defined, and row contains a list of values that represents the current row in the CSV file.

fields = ['parliament', 'name', 'country', ...]
row = ['7', 'Marta Andreasen', 'United Kingdom', ...]

What zip() does is it combines two lists into one list of pairs of items from both lists (like a zipper); i.e. zip(['a','b,'c'], ['A','B','C']) will return [('a','A'), ('b','B'), ('c','C')]. So in our case:

>>> zip(fields, row)
[('parliament', '7'), ('name', 'Marta Andreasen'), ('country', 'United Kingdom'), ...]

The dict() function simply converts the list of pairs into a dictionary.

>>> dict(zip(fields, row))
{'parliament': '7', 'name': 'Marta Andreasen', 'country': 'United Kingdom', ...}

The ** is a way of converting a dictionary into a keyword argument list for a function. So function(**{'key': 'value'}) is the equivalent of function(key='value'). So in out example, calling create(**dict(zip(field, row))) is the equivalent of:

create(parliament='7', name='Marta Andreasen', country='United Kingdom', ...)

Hope this clears things up.


As SiggyF says and only slightly differently than Joschua:

Create a text file with your schema, e.g.:

CREATE TABLE politicians (
    Parliament text, 
    Name text, 
    Country text, 
    Party_Group text, 
    National_Party text, 
    Position text

Create table:

>>> import csv, sqlite3
>>> conn = sqlite3.connect('my.db')
>>> c = conn.cursor()
>>> with open('myschema.sql') as f:            # read in schema file 
...   schema = f.read()
>>> c.execute(schema)                          # create table per schema 
<sqlite3.Cursor object at 0x1392f50>
>>> conn.commit()                              # commit table creation

Use csv module to read file with data to be inserted:

>>> csv_reader = csv.reader(open('myfile.txt'), skipinitialspace=True)
>>> csv_reader.next()                          # skip the first line in the file
['Parliament', 'Name', 'Country', ...

# put all data in a tuple
# edit: decoding from utf-8 file to unicode
>>> to_db = tuple([i.decode('utf-8') for i in line] for line in csv_reader)
>>> to_db                                      # this will be inserted into table
[(u'7', u'Marta Andreasen', u'United Kingdom', ...

Insert data:

>>> c.executemany("INSERT INTO politicians VALUES (?,?,?,?,?,?);", to_db)
<sqlite3.Cursor object at 0x1392f50>
>>> conn.commit()

Verify that all went as expected:

>>> c.execute('SELECT * FROM politicians').fetchall()
[(u'7', u'Marta Andreasen', u'United Kingdom', ...

And since you’ve decoded (to unicode) on input, you need to be sure to encode on output.
For example:

with open('encoded_output.txt', 'w') as f:
  for row in c.execute('SELECT * FROM politicians').fetchall():
    for col in row:


You could read the data using the csv module. Then you can create an insert sql statement and use the method executemany:

  cursor.executemany(sql, rows)

or use add_all if you use sqlalchemy.



You asked what the create(**dict(zip(fields, row))) line did.

I don’t know how to reply directly to your comment, so I’ll try to answer it here.

zip takes multiple lists as args and returns a list of their correspond elements as tuples.

zip(list1, list2) => [(list1[0], list2[0]), (list1[1], list2[1]), …. ]

dict takes a list of 2-element tuples and returns a dictionary mapping each tuple’s first element (key) to its second element (value).

create is a function that takes keyword arguments. You can use **some_dictionary to pass that dictionary into a function as keyword arguments.

create(**{‘name’:’john’, ‘age’:5}) => create(name=’john’, age=5)



Something like the following should work: (not tested)

# Open database (will be created if not exists)
conn = sqlite3.connect('/path/to/your_file.db')

c = conn.cursor()

# Create table
c.execute('''create table representatives
(parliament text, name text, country text, party_group text, national_party text, position text)''')

f = open("thefile.txt")
for i in f.readlines():
    # Insert a row of data
    c.execute("""insert into representatives
                 values (?,?,?,?,?,?)""", *i.split(", ")) # *i.split(", ") does unpack the list as arguments

# Save (commit) the changes

# We can also close the cursor if we are done with it


If you want to do it with a simple method using sqlite3, you can do it using these 3 steps:

$ sqlite3 db.sqlite3
sqlite> .separator ","
sqlite> .import myfile.txt table_name

However do keep the following points in mind:

  1. The .txt file should be in the same directory as your db.sqlite3,
    otherwise use an absolute path "/path/myfile.txt" when importing
  2. Your schema for tables (number of columns) should match with the number of values seperated by commas in each row in the txt file

You can use the .tables command to verify your table name

SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
auth_group                  table_name               
auth_group_permissions      django_admin_log          
auth_permission             django_content_type       
auth_user                   django_migrations         
auth_user_groups            django_session            

Leave a comment