How to connect data pipelines and dashboard in BDB |
Posted by: manjunath - 12-23-2022, 06:12 AM - Forum: BDB Data Pipeline Q & A
- No Replies
|
|
To connect a data pipeline to a dashboard in BDB, you will need to consider the specific requirements of your system and the tools and technologies that you are using. Here are a few options that you might consider:
- Direct connection: One option is to use the BDB API to connect the data pipeline directly to the dashboard. This can be done by configuring the data pipeline to send data to the BDB API as it becomes available, and then using the BDB API to retrieve the data and display it in the dashboard.
- Scheduled updates: Another option is to use the BDB scheduling feature to schedule regular updates to the dashboard. This can be done by setting up the data pipeline to run on a schedule, and then using the BDB scheduling feature to refresh the dashboard at regular intervals.
- Data storage: A third option is to store the data from the pipeline in a BDB database or data warehouse, and then connect the dashboard to the data storage system. This approach can allow the dashboard to access a wider range of data, and can make it easier to perform complex queries or aggregations on the data.
Again, the best approach for connecting a data pipeline to a dashboard in BDB will depend on the specific needs and constraints of your system. It may be necessary to experiment with different approaches to find the one that works best for your use case.
|
|
|
Dashboard Object properties |
Posted by: mohd.gulam - 12-23-2022, 06:04 AM - Forum: BDB Designer Q & A
- No Replies
|
|
Dashboard Object Properties:
i) Hide All: Clicking this icon hides all the components present in the dashboard. Users can hide individual components by enabling the checkbox provided next to the component's name.
ii) Lock All: Clicking this icon locks all the components present in the dashboard, i.e., the components cannot be moved from one place to another. Users can lock components individually by enabling the checkbox provided next to the component's name.
iii) Remove: Clicking this icon deletes all the components present in the dashboard. Users can delete components individually by clicking the remove icon provided next to the component's name.
iv) Duplicate: Clicking this option produces a duplicate of the selected component on the canvas.
v) Search: As the number of charting components dragged onto a dashboard grows, it becomes difficult to find a specific component. Use the 'Search' bar to search for the desired component.
|
|
|
What is hyperparameter tuning in ML models? |
Posted by: manjunath - 12-23-2022, 06:04 AM - Forum: DS- Lab Q&A
- No Replies
|
|
Hyperparameter tuning is the process of adjusting the hyperparameters of a machine learning (ML) model in order to optimize its performance. Hyperparameters are parameters that are set prior to training an ML model and are not learned during the training process. They control the behavior of the model and can have a significant impact on its performance.
Hyperparameter tuning involves selecting the optimal values for the hyperparameters of the model based on the characteristics of the data and the goals of the analysis. This can be done manually by trying out different values for the hyperparameters and evaluating the model performance, or it can be automated using tools such as grid search or random search.
Hyperparameter tuning is an important step in the ML model development process because it can significantly improve the performance of the model. It is especially important for complex models, such as deep learning models, which have many hyperparameters that can be adjusted.
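The manual-vs-automated point above can be sketched with scikit-learn's grid search (a minimal sketch, assuming scikit-learn is available; the decision-tree model and the parameter grid here are illustrative choices, not part of the original answer):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# A small synthetic classification dataset to tune against.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hyperparameters are fixed before training; grid search tries each combination.
param_grid = {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5]}

search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)

# The combination with the best cross-validated score.
print(search.best_params_)
```

Random search (`RandomizedSearchCV`) follows the same pattern but samples the grid instead of enumerating it, which scales better when there are many hyperparameters.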
|
|
|
How do we handle outliers in a dataset? |
Posted by: manjunath - 12-23-2022, 06:02 AM - Forum: DS- Lab Q&A
- No Replies
|
|
Outliers are data points that lie outside the normal range of values in a dataset. They can have a significant impact on statistical analyses and can often distort the overall pattern of the data, so it is important to identify and handle them appropriately.
There are several different ways to handle outliers in a dataset, and the best approach will depend on the specific characteristics of the data and the goals of the analysis. Some common options include:
- Ignoring the outlier: If the outlier is an isolated point and does not fit the pattern of the rest of the data, it may be best to simply ignore it and focus on the rest of the data.
- Transforming the data: Sometimes, outliers can be caused by the scale of the data. For example, if one variable is measured in dollars and another is measured in cents, the scale of the data will be very different and this could lead to outliers. In these cases, it may be helpful to transform the data to a common scale (such as converting all variables to percentages) in order to make it easier to compare the values.
- Clipping the data: Another option is to "clip" the data by setting a maximum or minimum value beyond which data points will be excluded from the analysis. This can be useful if the outlier is a single extreme value that is distorting the overall pattern of the data.
- Imputing the data: If the outlier is a missing value, it may be possible to impute (or estimate) a value for it based on the values of other similar data points. This can be done using a variety of techniques, such as linear interpolation or k-nearest neighbors.
- Identifying and treating the cause: In some cases, outliers may be caused by a specific problem or error in the data collection process. In these cases, it may be helpful to identify and correct the cause of the outlier in order to eliminate it.
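The clipping option above can be sketched with the standard library alone. The 1.5×IQR rule used here is one common convention for choosing the clip bounds, and `clip_outliers` is a hypothetical helper, not a named library function:

```python
import statistics

def clip_outliers(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR], a common outlier fence."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles of the data
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lo), hi) for v in values]

data = [10, 12, 11, 13, 12, 11, 95]   # 95 is an extreme value
print(clip_outliers(data))            # the 95 is pulled down to the upper fence
```

Clipping keeps the row in the dataset (unlike dropping it) while limiting how much a single extreme value can distort means and model fits.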
|
|
|
How do you add datasets in a notebook? |
Posted by: gnanashri - 12-23-2022, 06:01 AM - Forum: DS- Lab Q&A
- No Replies
|
|
You can add datasets to a notebook by entering your project; you'll see a "Datasets" section there. You can then choose whether to take your dataset from the Data Sandbox or from the Data Sets you have uploaded to your Data Center by checking the appropriate box.
|
|
|
How do we balance an imbalanced dataset? |
Posted by: manjunath - 12-23-2022, 06:00 AM - Forum: DS- Lab Q&A
- No Replies
|
|
An imbalanced dataset is a dataset in which one class is significantly more prevalent than the other class(es). This can be a problem when training machine learning models, as the model may be more accurate at predicting the more prevalent class, while struggling to accurately predict the minority class(es). This can lead to poor overall performance, particularly for the minority class(es).
There are several techniques that can be used to balance an imbalanced dataset:
· Oversampling the minority class: This involves duplicating or resampling minority-class data points to increase their prevalence in the dataset.
· Undersampling the majority class: This involves randomly selecting a subset of the majority class data points to reduce its prevalence in the dataset.
· Generating synthetic data points: This involves using algorithms such as SMOTE to generate new data points that are similar to the existing data points in the minority class.
· Weighting the classes: This involves assigning higher weights to the minority class data points when training the model, which can help to improve the model's performance on the minority class.
· Using a specialized algorithm: There are some machine learning algorithms, such as those based on decision trees, that are designed to handle imbalanced datasets more effectively.
The specific technique that is most appropriate for a particular dataset will depend on the characteristics of the data and the goals of the analysis.
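As a minimal sketch of the oversampling option, here is a pure-Python random oversampler (`oversample` is a hypothetical helper written for illustration; libraries such as imbalanced-learn provide production-ready versions):

```python
import random

def oversample(rows, labels):
    """Randomly duplicate minority-class rows until every class is as
    prevalent as the largest class."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw random duplicates to make up the shortfall for this class.
        extra = [random.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# 8 samples of class 0 vs. only 2 of class 1:
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
Xb, yb = oversample(X, y)
print(yb.count(0), yb.count(1))  # → 8 8
```

Note that oversampling should be applied only to the training split, never before the train/test split, or the duplicated rows will leak into evaluation.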
|
|
|
What are clustering models and how do we measure their performance? |
Posted by: manjunath - 12-23-2022, 05:58 AM - Forum: DS- Lab Q&A
- No Replies
|
|
Clustering models are a type of machine learning model that are used to group data points into clusters based on their similarity. They are commonly used in applications such as customer segmentation, text classification, and image segmentation.
There are many different types of clustering models, including k-means clustering, hierarchical clustering, and density-based clustering. The specific type of clustering model that is most appropriate for a particular problem will depend on the characteristics of the data and the goals of the analysis.
To measure the performance of a clustering model, there are several metrics that are commonly used. Some common clustering evaluation metrics include:
· Homogeneity: This is a measure of how pure the clusters are, with a value of 1 indicating that each cluster contains only data points belonging to a single class and a value of 0 indicating that the clusters are mixed.
· Completeness: This is a measure of how well all data points of a given class are assigned to the same cluster, with a value of 1 indicating that all members of each class fall into a single cluster and a value of 0 indicating that they are spread across clusters.
· V-measure: This is the harmonic mean of homogeneity and completeness.
· Adjusted Rand Index (ARI): This is a measure of the similarity between the clusters and the true labels of the data points, with a value of 1 indicating a perfect match and a value of 0 indicating no match.
· Silhouette score: This is a measure of the separation between the clusters, with a value of 1 indicating a strong separation and a value of -1 indicating a poor separation.
These metrics can be used to compare the performance of different clustering models and to determine which model is the best fit for the data.
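The label-based metrics above can be sketched with scikit-learn (a minimal sketch, assuming scikit-learn is available; the tiny label arrays are illustrative):

```python
from sklearn.metrics import (adjusted_rand_score, homogeneity_score,
                             silhouette_score, v_measure_score)

true_labels = [0, 0, 0, 1, 1, 1]

# Same grouping as the true labels, only the cluster ids are swapped --
# these metrics are invariant to cluster relabeling:
perfect = [1, 1, 1, 0, 0, 0]
print(adjusted_rand_score(true_labels, perfect))  # → 1.0
print(homogeneity_score(true_labels, perfect))    # → 1.0
print(v_measure_score(true_labels, perfect))      # → 1.0

# Silhouette needs the features, not the true labels; well-separated
# clusters score close to 1:
X = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
print(silhouette_score(X, perfect))
```

Homogeneity, completeness, V-measure, and ARI require ground-truth labels, so they apply when a labeled benchmark exists; the silhouette score works without labels and is often the only option for truly unsupervised data.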
|
|
|
How to use AutoML |
Posted by: abhishek_acharya - 12-23-2022, 05:53 AM - Forum: DS- Lab Q&A
- No Replies
|
|
How to use AutoML:
To create a model from a local file, first upload the file to the Data Sandbox, then open your DS project and go to the Dataset section.
Click "Add Dataset"; in the data source section, select Data Sandbox, select the required file that was uploaded to the Data Sandbox,
and click "Add". The file will be added as a dataset.
Then go back to the Dataset section, click "Create Experiment" for the required dataset, and give the experiment a name, description, and target column.
Click "Next", select whether it is a classification or regression dataset, then click "Done". The experiment will be created.
You can see the created experiments in the AutoML section.
|
|
|
API Ingestion component usage |
Posted by: mohd.gulam - 12-23-2022, 05:46 AM - Forum: BDB Data Pipeline Q & A
- No Replies
|
|
* Basically, an application programming interface (API) is a way for two or more computer programs to communicate with each other.
* It is a type of software interface that offers a service to other pieces of software.
* The API Ingestion component can be used in a pipeline workflow as an input component after the component has been configured.
* Users can use the component's ingestion URL in their program, or add it to a third-party tool such as Postman, sending the data in JSON format.
Configuration:
* Enter the ingestion_id and ingestion_secret, then set the ingestion type to API Ingestion or Webhook, based on the requirement.
* Then save the configuration to get the component instance ID URL. The same configuration has to be done in Postman to send data to an API Ingestion component in the pipeline. The pipeline must be active before sending data from the third-party app.
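Sending JSON from a program (instead of Postman) can be sketched with the standard library. Everything below the comment is a placeholder: the URL shape, the field names, and the sample records are assumptions for illustration, so take the real ingestion URL, id, and secret from the component's configuration in BDB:

```python
import json
import urllib.request

# Placeholder values -- replace with the real ones from your API Ingestion
# component's configuration panel in BDB.
INGESTION_URL = "https://<bdb-host>/ingestion/<component-instance-id>"

payload = {
    "ingestion_id": "<ingestion-id>",          # from component configuration
    "ingestion_secret": "<ingestion-secret>",  # from component configuration
    "data": [{"order_id": 101, "amount": 250.0}],  # the records to push
}
body = json.dumps(payload).encode("utf-8")

# Send only once the placeholders are filled in; the pipeline must be
# active before data is sent.
if "<bdb-host>" not in INGESTION_URL:
    req = urllib.request.Request(
        INGESTION_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        print(resp.status)
```

This mirrors what Postman does for you: a POST request with a JSON body to the component's ingestion URL.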
|
|
|
|