
Question:
I have in one database rough 2,000 movies with genres store like this:
MOVIE \ ID \ Genre
Avatar \ 1 \ action ^ adventure ^ scifi
matrix \ 2 \ action ^ scifi
and I was trying to make a system that would sort by genre, and when I researched it I found that I needed a table that would connect the two. (see this question: <a href="https://stackoverflow.com/questions/5423893/mysql-select-genres-problem-php" rel="nofollow">MySQL select genres problem. (php)</a>)
So I know how to implement this, I just dont know how I could transfer the genres into a seperate table.
The only way I could come up with is to go row by row and grab the genre column and explode the text, get the array and test for different genres, but that seemed too much of a hassle and difficult to do on 2k+ rows.
Thanks!
Answer1:The answer you quote in your question seems to have nailed the database strutcure you require. So really it is about replicating this.
Firstly I would create the Genre
table. Again much like in the reference quoted the following structure will suffice.
+------------+---------+
| id | genre |
+------------+---------+
| 1 | action |
+------------+---------+
| 2 | drama |
+------------+---------+
and so on. Once you have created the genre table. You will need to create your movie_genre table where the genres will be stored. So for example
+----------+-----------+---------+
| id |movie_id |genre_id |
+----------+-----------+---------+
The following structure. Again like the reference quoted.
Once you have the database structure ready, the next thing would be to create a simple switch function. For example
public function returnGenId($genre)
{
switch($genre)
{
case "action":
return 1;
break;
case "drama"
return 2;
break;
}
}
Obviously the return will match with the id of the genre in the genre table.
You can now get everything from your database by doing
SELECT * from movies
Which should leave you with an array similar to this
array(
[0]=>array(
[id]=>1
[movie]=>Avatar
[genre]=>action ^ adventure ^ scifi
)
)
Once you have your data then just loop through, exploding the genre and creating your new array for example
foreach($results as $key=>$result)
{
$eplosion = explode(' ^ ',$result['genre'];
foreach($explosion as $exploded)
{
$genres[] = returnGenId($exploded);
}
$data[$result['id']] = array('movie'=>$result['movie'],'genres'=>$genres);
}
This should leave with an array like so (as an example, depending on the switch statement)
array(
[1]=>array(
[movie]=>Avatar
[genres]=>array(
[1]=>1
[2]=>3
)
)
)
Now you can loop through the data as put it into the relevant database
foreach($data as $key=>$film)
{
foreach($film['genre'] as $dbGenre)
{
}
}
In the second foreach do a mysql insert into movie_genre using $key
as the movie_id
and then $dbGenre
as the genre_id
. By doing this one script you can sort your database tables out and structure them correctly. Although this can be a hassle. Once you have got it out of the way, things will be infinitely easier. Let me know how you get on
<em><strong>Update</strong></em>
I've run these kind of scripts on databases with a couple of hundred thousand rows, so I can't see an issue for a couple of thousand. But this also depends on the memory size set on your server.
Increasing your memory limit may be an idea if need be, see this link:
<a href="http://davidwalsh.name/increase-php-memory-limit-ini_set" rel="nofollow">http://davidwalsh.name/increase-php-memory-limit-ini_set</a>
Another idea is to hash out the last foreach loop and do a print of the data ie
print_r($data);
/*
foreach($data as $key=>$film)
{
foreach($film['genre'] as $dbGenre)
{
}
}
/*
Before executing the script add this as close to the very top (preferably the first line)
$start = time();
Then add this to the bottom. Again as close the to the bottom of the script.
$end = time();
$wait = $end – $start;
echo 'The script took ' . $wait . ' seconds to run';
This will show you the amount of time taken.
Always before messing with the database though, dump the database for safety. Maybe even clone it and run this on a test database for peace of mind.
Answer2:I think you can create a new table for genres first than you might use SUBSTRING_INDEX(field, '^', 1) in your query to select genres DISTINCT and write it to your new DB table in a single query.(INSERT INTO .... (SELECT ...). Than all you need to do is relate your new table of genres to your movies table. <a href="https://stackoverflow.com/questions/11241492/mysql-select-first-element-of-a-comma-separated-list" rel="nofollow">Here</a> is a link for SUBSTRING_INDEX.