Data Mining - RGPV 2025 Question Paper

Roll No.

IT-603 (B) (GS)
B.Tech., VI Semester
Examination, June 2025
Grading System (GS)
Data Mining

Time: Three Hours Maximum Marks: 70

Note:
i) Attempt any five questions.

किन्ही पाँच प्रश्नों को हल कीजिए।

ii) All questions carry equal marks.

सभी प्रश्नों के समान अंक हैं।

iii) In case of any doubt or dispute, the English version of the question should be treated as final.

किसी भी प्रकार के संदेह अथवा विवाद की स्थिति में अंग्रेजी भाषा के प्रश्न को अंतिम माना जायेगा।

(a)

Describe the steps involved in data mining when viewed as a process of knowledge discovery.

नॉलेज डिस्कवरी की प्रक्रिया के रूप में देखे जाने पर डाटा माइनिंग में शामिल चरणों का वर्णन करें।

(b)

Compare ROLAP, MOLAP, and OLAP. Explain the different OLAP operations.

ROLAP, MOLAP, OLAP की तुलना करें। अलग-अलग OLAP ऑपरेशंस को समझाइए।

(a)

Explain data transformation strategies in detail.

डाटा ट्रान्सफॉरमेशन स्ट्रेटेजी को विस्तार से समझाइए।

(b)

In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem.

वास्तविक दुनिया के डाटा में, कुछ विशेषताओं के मिसिंग वैल्यू वाले टुपल्स एक सामान्य घटना है। इस समस्या से निपटने के विभिन्न तरीकों का वर्णन कीजिए।
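Illustrative note (not part of the original paper): the sketch below shows three commonly described ways of handling missing values: ignoring the tuple, filling with a global constant, and filling with the attribute mean. The sample rows, attribute names, and helper functions are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical tuples; None marks a missing attribute value.
rows = [
    {"age": 25, "income": 40000},
    {"age": None, "income": 52000},
    {"age": 31, "income": None},
    {"age": 40, "income": 61000},
]

def drop_incomplete(rows):
    """Method 1: ignore every tuple that has at least one missing value."""
    return [r for r in rows if None not in r.values()]

def fill_constant(rows, attr, value):
    """Method 2: replace missing values of attr with a global constant."""
    return [{**r, attr: value if r[attr] is None else r[attr]} for r in rows]

def fill_mean(rows, attr):
    """Method 3: replace missing values of attr with the mean of the known values."""
    known = [r[attr] for r in rows if r[attr] is not None]
    return fill_constant(rows, attr, mean(known))

complete_rows = drop_incomplete(rows)
rows_age_constant = fill_constant(rows, "age", -1)
rows_age_mean = fill_mean(rows, "age")
```

Dropping tuples discards usable data when many tuples are incomplete, while a constant or mean fill keeps every tuple at the cost of biasing the attribute's distribution.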

(a)

Draw a snowflake schema diagram for a "University" data warehouse that consists of four dimensions (student, course, semester, and instructor) and two measures (count and avg_grade). At the lowest conceptual level, the avg_grade measure stores the actual course grade of the student. At higher conceptual levels, avg_grade stores the average grade for the given combination.

"विश्वविद्यालय" डाटा वेयरहाउस के लिए एक स्नोफ्लेक स्कीमा आरेख बनाइए जिसमें चार आयाम छात्र, पाठ्यक्रम, सेमेस्टर और प्रशिक्षक और दो उपाय गिनती और avg_grade शामिल हैं। निम्नतम वैचारिक स्तर पर, avg_grade माप छात्र के वास्तविक पाठ्यक्रम ग्रेड को संग्रहीत करता है। उच्च स्तर पर दिए गए स्कीमा के लिए avg_grades स्टोर औसत ग्रेड है।

(a)

Give an example to show that items in a strong association rule may actually be negatively correlated.

यह दिखाने के लिए एक उदाहरण दें कि एक स्ट्रोंग एसोसिएशन रूल में आइटम वास्तव में नकारात्मक रूप से सहसंबद्ध (को-रिलेट) हो सकते हैं।
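Illustrative note (not part of the original paper): one way to approach such an example is to compute support, confidence, and lift for a rule A => B. The transaction counts and the thresholds below are hypothetical; lift below 1 indicates that A and B occur together less often than they would if they were independent.

```python
def rule_metrics(n_total, n_a, n_b, n_ab):
    """Return (support, confidence, lift) for the rule A => B from raw counts."""
    support = n_ab / n_total            # P(A and B)
    confidence = n_ab / n_a             # P(B | A)
    lift = confidence / (n_b / n_total) # P(B | A) / P(B)
    return support, confidence, lift

# Hypothetical counts: 10,000 transactions, 6,000 contain A,
# 7,500 contain B, and 4,000 contain both.
support, confidence, lift = rule_metrics(10_000, 6_000, 7_500, 4_000)

# Hypothetical thresholds for calling a rule "strong".
MIN_SUPPORT, MIN_CONFIDENCE = 0.30, 0.60
is_strong = support >= MIN_SUPPORT and confidence >= MIN_CONFIDENCE
is_negatively_correlated = lift < 1
```

With these counts the rule passes both thresholds (support 0.40, confidence about 0.67), yet B alone occurs in 75% of transactions, so knowing that A occurred lowers the chance of B.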

(b)

Describe knowledge discovery process in detail.

नॉलेज डिस्कवरी प्रक्रिया को विस्तार से वर्णन कीजिए।

(b)

The Apriori algorithm makes use of prior knowledge of subset support properties. Prove that all nonempty subsets of a frequent itemset must also be frequent.

एप्रियोरी एल्गोरिथम सबसेट समर्थन गुणों के पूर्व ज्ञान का उपयोग करता है। साबित करें कि फ्रीक्वेंट आइटमसेट के सभी नॉन एम्प्टी उपसमुच्चय भी फ्रीक्वेंट होने चाहिए।
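Illustrative note (not part of the original paper): the property follows because every transaction that contains an itemset also contains each of its subsets, so a subset's support count can never be smaller. The sketch below checks this on a small hypothetical transaction list; it is a numerical illustration, not a proof.

```python
from itertools import combinations

# Hypothetical transaction database.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def nonempty_proper_subsets(itemset):
    """Yield every nonempty proper subset of itemset as a frozenset."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)

MIN_COUNT = 2  # hypothetical minimum support count
itemset = frozenset({"bread", "butter", "milk"})
itemset_count = support_count(itemset, transactions)
subset_counts = {
    s: support_count(s, transactions) for s in nonempty_proper_subsets(itemset)
}

# Every subset is contained in at least the transactions that contain the
# full itemset, so its count is at least as large.
all_subsets_frequent = all(c >= itemset_count >= MIN_COUNT
                           for c in subset_counts.values())
```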

(a)

Why is tree pruning useful in decision tree induction? What is a drawback of using a separate set of tuples to evaluate pruning?

ट्री प्रूनिंग, डिसीजन ट्री इंडक्शन में उपयोगी क्यों है? प्रूनिंग का मूल्यांकन करने के लिए टुपल्स के एक अलग सेट का उपयोग करने में क्या कमी है?

(b)

Give an example to show why the k-means algorithm may not find the global optimum, that is, may fail to optimize the within-cluster variation.

यह दिखाने के लिए एक उदाहरण दें कि k-means एल्गोरिथम ग्लोबल ऑप्टिमम क्यों नहीं खोज सकता है, जो क्लस्टर भिन्नता के भीतर अनुकूलन (ऑप्टिमाइज) कर रहा है।
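Illustrative note (not part of the original paper): the sketch below runs a plain Lloyd-style k-means on four hypothetical points at the corners of a 4-by-1 rectangle. The points, the two sets of initial centroids, and the helper name are assumptions made for illustration. One initialization converges to a much larger within-cluster sum of squared errors than the other, which shows that the result depends on the starting centroids.

```python
def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iteration; returns (final centroids, within-cluster SSE)."""
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sq_dist(p, c) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            tuple(sum(x) / len(members) for x in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids

    sse = sum(sq_dist(p, c)
              for members, c in zip(clusters, centroids) for p in members)
    return centroids, sse

# Hypothetical data: corners of a rectangle that is 4 wide and 1 high.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start between the bottom pair and between the top pair: the algorithm
# converges immediately to a bottom/top split.
_, bad_sse = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

# Start at one left corner and one right corner: left/right split.
_, good_sse = kmeans(points, [(0.0, 0.0), (4.0, 1.0)])
```

Both runs stop at a fixed point of the assignment and update steps, but only the left/right split attains the smaller within-cluster variation.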

(a)

Clusters may form a hierarchy, and outliers may belong to different granularity levels. Explain a clustering-based outlier detection method that can find outliers at different levels.

क्लस्टर एक पदानुक्रम (हायरार्की) बना सकते हैं, आउटलायर विभिन्न ग्रेन्युलैरिटी स्तर से संबंधित हो सकते हैं। एक क्लस्टरिंग आधारित आउटलायर डिटेक्शन विधि की व्याख्या करें जो विभिन्न स्तरों पर आउटलायर का पता लगा सके।

(a)

What is Web mining? Explain its types.

वेब माइनिंग क्या है? इसके प्रकारों को समझाइए।

(b)

Describe the security issues in data mining.

डाटा माइनिंग में सुरक्षा मुद्दों का वर्णन करें।

(a)

Describe spatial and temporal mining with suitable examples.

स्पेशियल और टेम्पोरल माइनिंग का उपयुक्त उदाहरण सहित वर्णन कीजिए।

(b)

Write the k-means algorithm with an example.

K-मीन्स एल्गोरिथम को उदाहरण सहित लिखिए।