請教一些面試問題
第一道題:
Given N computers networked together, with each computer storing N integers,
finds the "median" of all of the numbers. Assuming a computer can hold O(N)
integers. Also assume that there exists a computer on the network without
integers, that we can use to interface with the other computers storing the
integers.
以下是我想的解法:
1. Sort each of piles of numbers separately in an array -
題目有講memory O(N), 所以假設in memory sort:
qsort - O(N * log N)
radix sort - O(N)
問題來了 請問可以作radix sort嗎? 沒學過這個sort 只是聽說它可以O(N)
2. Use the extra computer as buffer, take the smallest number in each
computers
Construct a heap, each node has info: <data, source of data>
- O(N * logN) time, O(N) space
3. Take out root node from heap, peek the source, then get a new number
from that source computer - logN
Insert new number into heap - logN
Do it (N * N) / 2 -1 times, the root node's value is the answer
- O(N^2 * logN)
Total: O(N * logN) + O(N * logN) + O(N^2 * logN) = O(N^2 * logN)
不知到還有沒有人有更棒的解法?? 好想知道啊~
第二道題:
How to fast check if a URL is visited by web crawler?
我看到的解法: hash table (有這麼簡單嗎@@)
直覺上來說好像不對勁
一個URL假設是20 char, 算20 bytes
假設Internet有5 billion pages -> 5 * 20 billion bytes = 100 billion bytes
= 100 GB
100GB(至少) hastable? 有沒搞錯?
我查了一下wikipedia 上面也是說Google有個URL server專門在作這個URL revisit
check
請問真的是用Hashing嗎 還是Distributed Hashing??
--
※ 發信站: 批踢踢參(ptt3.cc)
◆ From: 141.158.245.93
討論串 (同標題文章)
Oversea_Job 近期熱門文章
PTT職涯區 即時熱門文章
69
194