Welcome, Guest. Please login or register.
Importing Existing Collections [In Progress]

 On this page:


Introduction


PVD includes import plugins for Excel files and text files in CSV format. Regardless of where the information comes from, if it can be arranged in the necessary format, it can be imported. The best tool for doing that is a spreadsheet. If you have Excel, use that and the Excel import plugin. If you don't, use any other spreadsheet application to prepare the data, save it in CSV format and import it using the CSV plugin.

As for plugins that import data from other applications, there is only one—the Ant Movie Catalog plugin (download). So this topic will focus on the universal method of importing data from any source—using a spreadsheet as an intermediate step.

For the purpose of this topic, "existing collection" means anything which is—or can be converted to—a list of movie titles, which may or may not include information in addition to the titles. Even if it's just a simple list of titles, importing that list will be much easier than entering them manually, one-at-a-time.


Identifying Data Sources


If you don't have an existing "database," there are many possible sources of movie lists you should consider. If you do have an existing database, you may want to add additional movies from such sources to your import. Even those will established PVD databases may find it useful to add movies from lists. Additional movies can be imported at any time.

Here's a list of sources to consider:

  • An existing database from another video database application. Virtually all of these will have the ability to export Excel or CSV files. If they don't, they probably have the ability to generate a printed report of some kind. This can be captured and imported into a spreadsheet for preparation for import.

  • Before letting our imagination run wild (with the rest of this list), there's one special source to consider—video files on your HDD. If those are your collection (i.e., you're not interested in videos for which you do not have such media) and you have no other information about them to capture, then you don't need to import anything. Instead, you may rely on *Scanning Media to add those to your database. You may still add movies this way, however, and then scan the files later to match them to the movies added.

  • This one should be obvious by now, but we'll add it to the list anyway—any spreadsheet listing you may have. Maybe you made an inventory of the DVD's you own, or have been keeping a list of movies seen. You would not only import the items from such lists, but whatever the list means as well. From your inventory, that might be Media label and/or Location. If your list includes neither of those, you might set one of them to "DVD"—just to distinguish them from movie records for which you don't own a DVD. From your list of movies seen, that might include the Seen date. If you list did not include the date, you would at least use it to set the Seen flag.

  • A list in any other electronic format that can be easily converted to a spreadsheet. If your inventory or movies seen lists were Word documents, for example, you could cut & paste the contents to Excel. Excel can also import data from text files of many different formats and layouts.

  • If you're not an accountant, it's perhaps unlikely you have an "inventory" lying around. But you may have used one or more of the growing number of websites that allow users to create lists of movies. Even if you haven't, using one to create a list of the movies you want in your database may be the easiest way to get started. All that's required is the site display your list in a form you can cut & paste into your spreadsheet.

  • An example of this is the IMDb My Movies service. With this service, any IMDb movie page is easily added to your personal "My Movies" list. The resulting list can be cut & paste into a spreadsheet. The items in the list are links to the movie pages. These can be extracted and imported, so the IMDb plugin will be able to access and download the data directly—without having to search for the title.

  • The IMDb can also generate many different lists, any of which may be cut & paste into your spreadsheet. It would take only minutes to compile a list of the IMDb Top 250, every Best Picture Oscar winners, Sci-Fi Feature films Released 1960 or later with User rating greater than 8 and more than 1,000 Votes, etc. 8)

  • You may wish to incorporate lists of movies included in other collections. For example, The Criterion Collection is "a continuing series of important classic and contemporary films on home video." Or 1001 Movies You Must See Before You Die. Such lists may serve two purposes. First, to simply record the fact that some of the movies otherwise in your collection are included in these lists. Do this by importing "Criterion" or "1001 Movies" into a custom Recommended by field. Second, to provide a "Wish List" of movies to consider viewing or acquiring. This might seem like a good way to create an unwieldy wish list, but consider this: After importing information about these movies, you may use Advanced search to select movies you're not interested in, and delete them. Depending on your interests and the lists available, this may be an excellent way to create a personalized list of movies you must see before you die. ;)

  • It's difficult to fathom how it happens, but some people manage to keep important information in printed form, but lose the electronic version. If this is you and your painstakingly created list of 1001 movies you've already seen in your life, don't despair. Scan and OCR that list to capture the data. The results will not be 100% accurate, but much better than entering all those movies manually.



Preparing Data for Import


Review of Excel use and tips.


Using the Data Import Plugins



The plugin is not dependent on headings in the Excel file. As long as the configuration has the correct sequence of columns specified, it will work even if there is no heading row. On the other hand, it does recognize the first row is headings, if that's what they are. It's helpful to be able to view descriptive headings while entering the configuration—where the fields must be listed in exactly the same order as the columns in Excel.



When importing additional information to existing records, the following confirmation dialog appears. It may seem straightforward, but can be a little confusing when one appreciates the import works on a field level, not just for each record as a whole. This is what each option does:

  • Update adds source (Excel) field if destination record exists (as determined by matching ID, Title/Year or URL); otherwise does nothing.
  • Merge adds source (Excel) field if destination field (PVD) is null; otherwise leaves destination data intact.
  • Add adds all the source fields to a new duplicate record; the existing record is unaffected.
  • Do not add skips the record; continues the import with the next record.
  • Overwrite replaces the destination field with the source field; if the source field is null, the destination field is left unchanged.
  • Cancel terminates the import.



This dialog will appear for each record already in the database unless "do not show again is checked"—then the response is applied to all records imported.

It's easy to assume all records will be "updated" when choosing Overwrite. It can be important to recognize blank fields in import records will not cause existing data to be "deleted." If this is what is intended, blank cells in Excel should be changed to "null" or "delete me," which can be deleted in PDV after importing (if available for the field, using multiple movie editor).


Case Study: Importing the IMDb Top 250




Cut & paste this into Excel, and create...







Comments:
rick.ca made the following comment on September 30, 2010, 08:55:26 am:

Reserved

deazo made the following comment on October 01, 2010, 08:06:51 am:

Rick, correct me if I am wrong but if I remember well, when I imported an existing collection into another one using CSV import, the "date added" field was not captured, and all "date added" values became the date when I imported the collection.
Even if choosing to overwrite.

rick.ca made the following comment on October 01, 2010, 10:51:08 am:

If a record in the import file doesn't exist in the database, then it's going to be added and Date added set to today. To set it to a value specified in the import file, run the import again. The second time, Date added will be set to the import value, and not set to today—because the record is only being updated, not added.

Since the import works on the field level, it can be run any number of times to add, change or rearrange data. That's a big help when creating a new database from an existing collection, and you're not too sure on the exact design or content of the database. In this case, the answer is to just run the import again. In another situation, it may be to add another column of data to the import file, map that to a new custom field, and run the import again to add the additional data.

deazo made the following comment on October 03, 2010, 08:22:33 pm:

I didn't know that. Good thing I kept the old database, I can re-run the import to change the date, nice one!

Powered by MySQL Powered by PHP Valid XHTML 1.0! Valid CSS!