{"id":2121,"date":"2015-02-26T23:50:06","date_gmt":"2015-02-26T23:50:06","guid":{"rendered":"http:\/\/41j.com\/blog\/?p=2121"},"modified":"2015-02-25T00:00:00","modified_gmt":"2015-02-25T00:00:00","slug":"select-random-line-file-single-pass","status":"publish","type":"post","link":"https:\/\/41j.com\/blog\/2015\/02\/select-random-line-file-single-pass\/","title":{"rendered":"Select random line from a file in a single pass"},"content":{"rendered":"<p>I was asked this in an interview a few years back and I though it was kind of neat. The question is how do you select a single line from a file randomly, such that each line has a equal probability of being selected. The naive solution is of course to count the number of lines in the file, and then to select a line and get it from the file. But can you do it in a single pass, without seeking through the file?<\/p>\n<p>The solution is kind of neat, have a buffer to store a &#8220;selected&#8221; line from the file. Then you read each line in turn, you then randomly swap your &#8220;selected&#8221; line with the new line with decreasing probability. The probability decreases to compensate for the fact that previous lines are more likely to have been replaced already.<\/p>\n<p>Here&#8217;s a C++ solution:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nstring selected_line;\r\nfor(size_t linenumber=0;!file.eof();linenumber++) {\r\n\r\n  string s = file.getline();\r\n\r\n  int64_t r = rand64();\r\n  r = r%linenumber;\r\n  if(r==0) {\r\n    selected_line = s;\r\n  }\r\n}\r\ncout &lt;&lt; selected_line &lt;&lt; endl;\r\n<\/pre>\n<p>You can read more about this <a href=\"http:\/\/docstore.mik.ua\/orelly\/perl3\/cookbook\/ch08_07.htm\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I was asked this in an interview a few years back and I though it was kind of neat. The question is how do you select a single line from a file randomly, such that each line has a equal probability of being selected. The naive solution is of course to count the number of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[1],"tags":[],"class_list":["post-2121","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p1RRoU-yd","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/posts\/2121","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/comments?post=2121"}],"version-history":[{"count":2,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/posts\/2121\/revisions"}],"predecessor-version":[{"id":2125,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/posts\/2121\/revisions\/2125"}],"wp:attachment":[{"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/media?parent=2121"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/categories?post=2121"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/tags?post=2121"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}