
Source: Baidu Wenku   Editor: 中财网   Time: 2024/05/09 09:33:54





89  What should you do when you find a good article but cannot understand it?
Cached page   Category: Related news


88  Information extraction - Wikipedia, the free encyclopedia
Cached page   Category: Information extraction

Information extraction (IE) is a type of information retrieval whose goal is to automatically extract structured or semistructured information from unstructured machine-readable documents. A typical application of IE is to scan a set of documents written in a natural language and populate a database with the information extracted. Current approaches to IE use natural language processing techniques that focus on very restricted domains. For example, the Message Understanding Conference (MUC) is a competition-based conference that focused on the following domains in the past:

87  MySQL Security Guide
Cached page   Category: Databases


86  MySQL Advanced Features: Comparison with Other Databases - MYSQL - 技术天地 - 赛迪网
Cached page   Category: Databases

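For a true comparison of speed, consult the growing MySQL benchmark suite. See 10.8 Using Your Own Benchmarks. Because there is no thread creation overhead, a smaller parser, fewer features, and simpler security, mSQL should be faster in the following areas: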

85  What Is a Massive Data Mining Engine--DoNews.com--IT Community
Cached page   Category: Search engines


84  Block-Level Link Analysis - What Does It Mean To You?
Cached page   Category: Web data mining

Microsoft's research lab has released a paper in which they discuss a new way to rank web sites. The new method is called block-level link analysis.

83  VIPS: a VIsion based Page Segmentation Algorithm
Cached page   Category: Web data mining

The VIsion-based Page Segmentation (VIPS) algorithm aims to extract the semantic structure of a web page based on its visual presentation. This semantic structure is a tree; each node in the tree corresponds to a block. Each node is assigned a value (Degree of Coherence, DoC) indicating how coherent the content in the block is based on visual perception; the larger the DoC value, the more coherent the block. The VIPS algorithm makes full use of page layout structure. It first extracts all the suitable blocks from the HTML DOM tree, and then it finds the separators between these blocks. Here, separators denote the horizontal or vertical lines in a web page that visually cross no blocks. Based on these separators, the semantic tree of the web page is constructed. Thus, a web page can be represented as a set of blocks (leaf nodes of the semantic tree). Compared with DOM-based methods, the segments obtained by VIPS are much more semantically aggregated. Noisy information, such as navigation, advertisements, and decoration, can be easily removed because it is often placed in certain positions of a page. Contents with different topics are distinguished as separate blocks.
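
The representation described above (a tree of blocks, each carrying a DoC value, with the page seen as its set of leaf blocks) might be sketched as follows. This is only an illustration of the data structure, not the VIPS implementation: the `Block` class, the `leaf_blocks` helper, the block labels and the DoC numbers are all hypothetical, and the visual steps (block extraction from the DOM, separator detection) are not modelled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """One node of the semantic tree (hypothetical structure, not from the paper)."""
    label: str
    doc: float  # Degree of Coherence; the values used below are made up
    children: List["Block"] = field(default_factory=list)


def leaf_blocks(root: Block) -> List[Block]:
    """Return the leaf nodes of the tree in left-to-right order."""
    if not root.children:
        return [root]
    leaves: List[Block] = []
    for child in root.children:
        leaves.extend(leaf_blocks(child))
    return leaves


# A hand-built example tree; a real one would come from the segmentation step.
page = Block("page", 0.2, [
    Block("navigation", 0.9),
    Block("body", 0.5, [
        Block("article", 0.9),
        Block("advertisement", 0.8),
    ]),
])

# The page as a set of blocks, with one noisy block type filtered out by label.
content = [b.label for b in leaf_blocks(page) if b.label != "advertisement"]
```

Filtering by label here stands in for the position-based noise removal that the text mentions; how noise is actually identified is left open.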

82  My Google phone interview
Cached page   Category: Related news

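Because I had applied for the Wireless Developer position, he asked whether I had done any J2ME or mobile application development. I hadn't, so I could only admit that honestly, though I added that I had done protocol stack development. He clearly wasn't interested in that and didn't ask any more about it. I spent all the remaining time on a single algorithm problem he gave me. It took more than 40 minutes, and in the end he basically had to tell me the algorithm himself, which was embarrassing. Thinking about it now, it should have been a simple problem, and I don't know why I couldn't work it out at the time, which is even more embarrassing. I would advise anyone applying for a development position to build solid algorithm fundamentals. I have never studied this area systematically and am weak at it. I'll post his problem here for anyone who is interested.

Arrays are given that record the positions at which each word occurs in a document. Given k input words, find the shortest span of positions that contains all k words. For example, three of the arrays are:

hello -> 5 14 19 35 52
world -> 11 17 29 40
goodbye -> 1 25 63 72

The numbers are the positions at which the word occurs in the document. If the input is hello world goodbye, what is the shortest span?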
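
One common way to approach this problem is to treat it as finding the smallest range that covers one element from each of k sorted lists, using a min-heap. The sketch below takes that approach. The post does not say which algorithm the interviewer had in mind, so this is not necessarily his solution; the function name `shortest_span` and the dictionary input format are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def shortest_span(positions: Dict[str, List[int]]) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the shortest span containing every word.

    `positions` maps each query word to its sorted list of positions.
    Returns None if there are no words or some word never occurs.
    """
    lists = list(positions.values())
    if not lists or any(not lst for lst in lists):
        return None

    # The heap holds one current position per word: (position, list index, offset).
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists)]
    heapq.heapify(heap)
    current_max = max(lst[0] for lst in lists)
    best = (heap[0][0], current_max)

    while True:
        lo, i, j = heapq.heappop(heap)
        # [lo, current_max] covers one occurrence of every word.
        if current_max - lo < best[1] - best[0]:
            best = (lo, current_max)
        # Advance the word that owns the smallest position; stop when it runs out.
        if j + 1 == len(lists[i]):
            return best
        nxt = lists[i][j + 1]
        current_max = max(current_max, nxt)
        heapq.heappush(heap, (nxt, i, j + 1))


example = {
    "hello": [5, 14, 19, 35, 52],
    "world": [11, 17, 29, 40],
    "goodbye": [1, 25, 63, 72],
}
print(shortest_span(example))  # (17, 25): world at 17, hello at 19, goodbye at 25
```

With n positions in total across the k lists, each position is pushed and popped at most once, so the running time is O(n log k).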

81  :::Issues to Consider When Implementing a Data Mining Project:::
Cached page   Category: Data mining basics


80  :::Data Mining Applications:::
Cached page   Category: Data mining basics

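It should be emphasized that data mining technology has been application-oriented from the very beginning. Today "data mining" is a fashionable term in many fields, especially in commercial sectors such as banking, telecommunications, insurance, transportation, and retail (for example, supermarkets). Typical business problems that data mining can address include: Database Marketing, Customer Segmentation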

