For testing purposes, it is very useful sometimes to generate some random content in a mysql table. Such table simulates a real table with content generated in long time. For instance, suppose you have a mysql table with comments from users. Each day, users post a number of comments. Thus, i a year or so, there have been accumulated many comments (let’s say, about 100.000 comments). But you would like to test some database queries in a real environment before users can post their comments. So, this post will explain how to generate a big table with comments (automatically generated) using mysql stored routines.
First, I’ve got an English dictionary with words, which I’ve imported into a simple MySQL table with structure described below:
1 2 3 4 5 6 7 8 9 10 | CREATE TABLE `dict_en` ( `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, `word` VARCHAR(50) NOT NULL DEFAULT '', PRIMARY KEY (`id`), INDEX `word` (`word`) ) COMMENT='English dictionary' COLLATE='utf8_general_ci' ENGINE=MyISAM ROW_FORMAT=DEFAULT; |
Second, I’ve created a simple table that will host so-called articles (randomly created):
1 2 3 4 5 6 7 8 9 | CREATE TABLE `articles` ( `id` BIGINT(20) NOT NULL AUTO_INCREMENT, `content` MEDIUMTEXT NULL, PRIMARY KEY (`id`) ) COMMENT='articole' COLLATE='utf8_general_ci' ENGINE=MyISAM ROW_FORMAT=DEFAULT; |
Now, you should import a simple text file with dictionary data, each word on a single line. This could be done with LOAD DATA INFILE command like this:
1 2 3 4 | LOAD DATA INFILE 'my_dictionary_file.txt' INTO TABLE dict_en FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n" (`word`); |
At this moment, we use the dictionary table to generate some variable-width random generated text that simulated a comment. For this example, let’s generate a comment with a random number of words between 1 and 1000, words separated by SPACE. This can be accomplished with MySQL function described below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | CREATE FUNCTION `create_content_by_wordlist`() RETURNS mediumtext LANGUAGE SQL DETERMINISTIC COMMENT 'Creates a text with random words from dictionary table' BEGIN DECLARE retVal MEDIUMTEXT DEFAULT ''; DECLARE iter INT(10) DEFAULT 0; DECLARE randLength INT(10) DEFAULT 0; SET randLength = FLOOR(1000 * rand()); SET iter=0; SET retVal = ''; SELECT COUNT(`id`) INTO @cntWL FROM dict_en; WHILE (iter<randLength) DO SET @curId = FLOOR(@cntWL * RAND()); SELECT word INTO @wrd FROM dict_en WHERE `id`=@curId LIMIT 1; SET retVal = CONCAT(retVal,' ', @wrd); SET iter = iter+1; END WHILE; RETURN retVal; END |
Now it’s time to fill the comments table, which is named “articles” for simplicity. Next stored routine starts filling that table with content. It accepts a single parameter: number of comments (or articles):
1 2 3 4 5 6 7 8 9 10 11 12 13 | CREATE PROCEDURE `create_articles`(IN `_records` INT) LANGUAGE SQL NOT DETERMINISTIC COMMENT 'Fill articles table' BEGIN DECLARE contor INT(10) DEFAULT 0; SET contor = 0; # TRUNCATE TABLE articles; WHILE (contor<_records) DO INSERT INTO articles SET `content`=create_content_by_wordlist(); SET contor = contor+1; END WHILE; END |
All prepared now! So, let’s put those stored routines to work. All you have to do is just create table for dictionary, fill it with data (I’ve used a text file with 350k english words). Finally, put mysql to hard work: run create_articles() procedure:
1 | CALL create_articles(10000); |
Be aware, a greater number of articles make articles table bigger. In my situation, for 250k articles randomly generated, the articles table has about 5G in size. It is a good situation for testing large MySQL tables. You could use indexes, partitions or other engines to test different study cases.

