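Today's update is the English edition of frequently asked data warehouse interview questions, split into three parts. This is the first part.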

What is a Data Warehouse?

Data warehousing (DW) is a method of gathering and analysing data from many sources in order to get useful business insights. Typically, a data warehouse is used to integrate and analyse corporate data from many sources. The data warehouse is the heart of the business intelligence (BI) system, which is designed to analyse and report on data.

A data warehouse is a collection of technologies and components that supports a company's data strategy. It is the electronic storage of a large volume of company data intended for query and analysis rather than transaction processing. Data warehousing converts data into information and makes it available to users in time for it to inform decisions. The warehouse is built by combining data from a variety of disparate sources to support analytical reporting, structured and ad hoc queries, and decision making. Cleaning, integrating, and consolidating data are all part of data warehousing.
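
The cleaning, integrating, and consolidating steps can be illustrated with a minimal Python sketch. The two source extracts, their field names, and the helper functions `clean` and `consolidate` are all hypothetical and exist only for illustration; a real warehouse load would involve far more validation.

```python
from collections import defaultdict

# Hypothetical extracts from two source systems (names and fields are made up).
crm_rows = [
    {"customer": " Alice ", "revenue": "120.50"},
    {"customer": "bob", "revenue": "80"},
]
billing_rows = [
    {"customer": "ALICE", "revenue": "30.00"},
    {"customer": "Bob", "revenue": None},  # missing value, dropped during cleaning
]

def clean(row):
    """Normalise the customer name and parse revenue; return None for unusable rows."""
    if row["revenue"] is None:
        return None
    return {
        "customer": row["customer"].strip().lower(),
        "revenue": float(row["revenue"]),
    }

def consolidate(*sources):
    """Integrate rows from all sources and consolidate revenue per customer."""
    totals = defaultdict(float)
    for source in sources:
        for row in source:
            cleaned = clean(row)
            if cleaned is not None:
                totals[cleaned["customer"]] += cleaned["revenue"]
    return dict(totals)

warehouse_table = consolidate(crm_rows, billing_rows)
print(warehouse_table)  # {'alice': 150.5, 'bob': 80.0}
```

The point of the sketch is only that differently formatted records for the same customer end up as one consistent row that analysis can rely on.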

The data warehouse is kept separate from the company's operational database. It is an environment rather than a product: an architectural design for an information system that gives users access to current and historical decision-support data that is difficult to access or present in a standard operational data store.

For example, a report on current inventory information run against an operational database may involve more than 12 join conditions, which can make the query and the report slow to respond. A data warehouse uses a different design, organised around analysis rather than transactions, that can help to improve query performance and minimise response time for reporting and analytics.
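
The text does not say which design it has in mind. One common warehouse layout is a star schema, where a central fact table joins to a few small dimension tables, so a report needs one or two joins instead of a dozen. The sketch below assumes that layout and uses Python's standard `sqlite3` module; the table and column names (`fact_inventory`, `dim_product`) are hypothetical.

```python
import sqlite3

# In-memory database holding a tiny, made-up star schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_inventory (
        product_id INTEGER REFERENCES dim_product(product_id),
        warehouse  TEXT,
        quantity   INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'tools'), (2, 'tools'), (3, 'paint');
    INSERT INTO fact_inventory VALUES
        (1, 'north', 10), (2, 'north', 5), (3, 'north', 7), (1, 'south', 4);
""")

# The inventory report needs a single join from the fact table to one dimension.
inventory_by_category = conn.execute("""
    SELECT d.category, SUM(f.quantity)
    FROM fact_inventory AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()

print(inventory_by_category)  # [('paint', 7), ('tools', 19)]
```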

The following are some alternative names for the data warehouse system:

  • Decision Support System (DSS).
  • Management Information System.
  • Executive Information System.
  • Analytic Application.
  • Business Intelligence Solution.

Data Warehouse Interview Questions for Freshers

1. What do you mean by data mining? Differentiate between data mining and data warehousing.

Data mining is the process of analysing large amounts of data to find patterns, trends, and other usable information that help a company make data-driven decisions. Put differently, it examines hidden patterns in data from several perspectives and categorises them into useful information. The data is usually gathered and assembled in stores such as data warehouses, where it supports efficient analysis, data mining algorithms, and decision making, with the ultimate aim of cutting costs and generating revenue. Data mining examines enormous amounts of data automatically, looking for patterns and trends that go beyond simple analysis, and it uses advanced mathematical algorithms to segment the data and estimate the probability of future events.
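
As a toy illustration of "finding patterns", the sketch below counts which pairs of items appear together in several purchases. This is a deliberately simple co-occurrence count, not a full data mining algorithm; the transactions and the function name `frequent_pairs` are made up for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk", "eggs"},
]

def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair has one canonical ordering, e.g. ('bread', 'milk').
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

patterns = frequent_pairs(transactions, min_support=3)
print(patterns)  # {('bread', 'milk'): 3}
```

A pattern such as "bread and milk are bought together" is the kind of result a business could act on, for example in shelf placement or promotions.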

The following are the differences between data warehousing and data mining:

Data Warehousing | Data Mining
A data warehouse is a database system intended for analytical rather than transactional purposes. | Data mining is the technique of examining data for patterns.
In data warehousing, data is loaded and stored on a regular basis. | In data mining, data is analysed on a regular basis.
Data warehousing is carried out mainly by engineers. | Data mining is carried out by business users with the assistance of technologists.
Data warehousing is the process of bringing all relevant data together. | Data mining is the process of extracting information from large datasets.
Data warehousing prepares and stores the consolidated data. | Data mining typically runs on top of that stored data.

2. What do you mean by OLAP in the context of data warehousing? What guidelines should be followed while selecting an OLAP system?

OLAP stands for On-Line Analytical Processing. It is a category of software technology that lets analysts, managers, and executives gain insight through fast, consistent, interactive access to data that has been transformed from raw form to reflect the real dimensionality of the company as its users understand it. OLAP supports multidimensional analysis of corporate data together with complex calculations, trend analysis, and advanced data modelling. It is increasingly the foundation for intelligent business solutions, including business performance management, strategy, budgeting, forecasting, financial reporting, analysis, modelling, knowledge discovery, and data warehouse reporting. End users can use OLAP for ad hoc analysis of data across several dimensions, which gives them the information and understanding they need to make better decisions.
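
Two basic multidimensional operations, roll-up (aggregating away dimensions) and slice (fixing one dimension to a value), can be sketched in plain Python. This is only an illustration of the idea, not how an OLAP engine is implemented; the sales facts, the `DIMENSIONS` tuple, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, amount).
DIMENSIONS = ("region", "product", "quarter")
facts = [
    ("EU", "laptop", "Q1", 100),
    ("EU", "laptop", "Q2", 150),
    ("EU", "phone", "Q1", 80),
    ("US", "laptop", "Q1", 200),
    ("US", "phone", "Q2", 120),
]

def roll_up(rows, keep):
    """Sum the amount over every dimension that is not listed in keep."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

def slice_cube(rows, dimension, value):
    """Keep only the rows where one dimension equals a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in rows if row[i] == value]

by_region = roll_up(facts, keep=("region",))
q1_by_product = roll_up(slice_cube(facts, "quarter", "Q1"), keep=("product",))

print(by_region)      # {('EU',): 330, ('US',): 320}
print(q1_by_product)  # {('laptop',): 300, ('phone',): 80}
```

The same facts answer both "sales per region" and "first-quarter sales per product" without any change to the underlying data, which is the sense in which the analysis is ad hoc and multidimensional.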

The following guidelines should be followed when selecting an OLAP system:
