請教一些面試問題

看板Oversea_Job (海外工作)作者時間17年前 (2007/08/23 14:11), 編輯推噓0(000)
留言0則, 0人參與, 最新討論串1/8 (看更多)
第一道題: Given N computers networked together, with each computer storing N integers, finds the "median" of all of the numbers. Assuming a computer can hold O(N) integers. Also assume that there exists a computer on the network without integers, that we can use to interface with the other computers storing the integers. 以下是我想的解法: 1. Sort each of piles of numbers separately in an array - 題目有講memory O(N), 所以假設in memory sort: qsort - O(N * log N) radix sort - O(N) 問題來了 請問可以作radix sort嗎? 沒學過這個sort 只是聽說它可以O(N) 2. Use the extra computer as buffer, take the smallest number in each computers Construct a heap, each node has info: <data, source of data> - O(N * logN) time, O(N) space 3. Take out root node from heap, peek the source, then get a new number from that source computer - logN Insert new number into heap - logN Do it (N * N) / 2 -1 times, the root node's value is the answer - O(N^2 * logN) Total: O(N * logN) + O(N * logN) + O(N^2 * logN) = O(N^2 * logN) 不知到還有沒有人有更棒的解法?? 好想知道啊~ 第二道題: How to fast check if a URL is visited by web crawler? 我看到的解法: hash table (有這麼簡單嗎@@) 直覺上來說好像不對勁 一個URL假設是20 char, 算20 bytes 假設Internet有5 billion pages -> 5 * 20 billion bytes = 100 billion bytes = 100 GB 100GB(至少) hastable? 有沒搞錯? 我查了一下wikipedia 上面也是說Google有個URL server專門在作這個URL revisit check 請問真的是用Hashing嗎 還是Distributed Hashing?? -- ※ 發信站: 批踢踢參(ptt3.cc) ◆ From: 141.158.245.93
文章代碼(AID): #16pIKQ00 (Oversea_Job)
文章代碼(AID): #16pIKQ00 (Oversea_Job)