How to save data as Model? |
Posted by: sariga.vr@bdb.ai - 12-23-2022, 10:54 AM - Forum: BDB Data Pipeline Q & A
- No Replies
|
|
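# Import the notebook executor available inside a DS Lab notebook.
from Notebook.DSNotebook.NotebookExecutor import NotebookExecutor
nb = NotebookExecutor()
# Save the model; modelName is the name it is stored under.
saved_model = nb.save_model(model='model', modelName='cluster_model.pkl', modelType='ml')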
Using this code we can save the data as a model. Here ‘modelName’ is the name under which the model is saved.
To use it as a model, it must first be registered.
To register it as a model:
Go to the Models option in the right-side panel of DS Lab.
Click the three dots next to the model name and select the Register option.
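The .pkl extension in the model name suggests Python pickle serialization, although the post does not say how save_model stores the file. As a rough illustration only, the sketch below uses the standard pickle module, not the NotebookExecutor API, to show what saving and reloading an object looks like. The dictionary standing in for a model and the in-memory buffer are assumptions.

```python
import io
import pickle

# Hypothetical stand-in for a trained clustering model (not a real BDB object).
model = {"n_clusters": 2, "centers": [[0.0, 0.0], [5.0, 5.0]]}

# Serialize the object to bytes, as a .pkl file would hold them.
buffer = io.BytesIO()
pickle.dump(model, buffer)

# Reload the object from the serialized bytes.
buffer.seek(0)
restored = pickle.load(buffer)

print(restored == model)
```

The restored object compares equal to the original, which is the property a saved model needs in order to be reused later.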
|
|
|
What is a Kafka event? |
Posted by: gnanashri - 12-23-2022, 10:53 AM - Forum: BDB Data Pipeline Q & A
- No Replies
|
|
An event holds data coming from a producer and passes it on to a consumer. It is an intermediary stage where the data lives, and you can specify how long the data stays in the event.
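The idea of an intermediary stage with a retention time can be sketched with a toy in-memory class. This is not the Kafka client API or the BDB implementation; the class name Event, the retention_seconds parameter, and the explicit timestamps are all assumptions made for illustration.

```python
from collections import deque


class Event:
    """Toy in-memory stand-in for an event with a retention period."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._records = deque()  # (timestamp, value) pairs, oldest first

    def produce(self, value, now):
        """A producer appends a record stamped with the given time."""
        self._records.append((now, value))

    def consume(self, now):
        """A consumer reads every record still inside the retention window."""
        # Drop records older than the retention period.
        while self._records and now - self._records[0][0] > self.retention_seconds:
            self._records.popleft()
        return [value for _, value in self._records]


event = Event(retention_seconds=60)
event.produce("order-1", now=0)
event.produce("order-2", now=50)

recent = event.consume(now=55)   # both records are younger than 60 seconds
later = event.consume(now=100)   # "order-1" is 100 seconds old and has expired
```

In this sketch, reading does not remove a record; only the retention period does.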
|
|
|
Best practice to write SQL queries |
Posted by: ArghaPratim - 12-23-2022, 10:53 AM - Forum: BDB Designer Q & A
- No Replies
|
|
· Use Uppercase for the Keywords
Avoid
select id, name from company.customers
Prefer
SELECT id, name FROM company.customers
· Use Snake Case for the schemas, tables, columns
Avoid
SELECT Customers.id,
Customers.name,
COUNT(WebVisit.id) as nbVisit
FROM COMPANY.Customers
JOIN COMPANY.WebVisit ON Customers.id = WebVisit.customerId
WHERE Customers.age <= 30
GROUP BY Customers.id, Customers.name
Prefer
SELECT customers.id,
       customers.name,
       COUNT(web_visit.id) AS nb_visit
FROM company.customers
JOIN company.web_visit ON customers.id = web_visit.customer_id
WHERE customers.age <= 30
GROUP BY customers.id, customers.name
· Use aliases when it improves readability
Avoid
SELECT customers.id,
customers.name,
customers.context_col1,
nested.f0_
FROM company.customers
JOIN (
SELECT customer_id,
MIN(date)
FROM company.purchases
GROUP BY customer_id
) ON customer_id = customers.id
Prefer
SELECT customers.id,
       customers.name,
       customers.context_col1 AS ip_address,
       first_purchase.date AS first_purchase_date
FROM company.customers
JOIN (
    SELECT customer_id,
           MIN(date) AS date
    FROM company.purchases
    GROUP BY customer_id
) AS first_purchase
ON first_purchase.customer_id = customers.id
· Formatting: Carefully use Indentation & White spaces
Avoid
SELECT customers.id, customers.name, customers.age, customers.gender, customers.salary, first_purchase.date
FROM company.customers
LEFT JOIN ( SELECT customer_id, MIN(date) as date FROM company.purchases GROUP BY customer_id ) AS first_purchase
ON first_purchase.customer_id = customers.id
WHERE customers.age<=30
Prefer
SELECT customers.id,
       customers.name,
       customers.age,
       customers.gender,
       customers.salary,
       first_purchase.date
FROM company.customers
LEFT JOIN (
    SELECT customer_id,
           MIN(date) AS date
    FROM company.purchases
    GROUP BY customer_id
) AS first_purchase
ON first_purchase.customer_id = customers.id
WHERE customers.age <= 30
· Avoid Select *
Avoid
SELECT * EXCEPT(id) FROM company.customers
Prefer
SELECT name,
       age,
       salary
FROM company.customers
· Go for the JOIN Syntax
Avoid
SELECT customers.id,
customers.name,
COUNT(transactions.id) as nb_transaction
FROM company.customers, company.transactions
WHERE customers.id = transactions.customer_id
AND customers.age <= 30
GROUP BY customers.id, customers.name
Prefer
SELECT customers.id,
       customers.name,
       COUNT(transactions.id) AS nb_transaction
FROM company.customers
JOIN company.transactions ON customers.id = transactions.customer_id
WHERE customers.age <= 30
GROUP BY customers.id, customers.name
· Sometimes, it might be worth splitting into multiple queries
Instead Of
CREATE TABLE customers_infos AS
SELECT customers.id,
customers.salary,
traffic_info.weeks_since_last_visit,
category_info.most_visited_category_id,
purchase_info.highest_purchase_value
FROM company.customers
LEFT JOIN ([..]) AS traffic_info
LEFT JOIN ([..]) AS category_info
LEFT JOIN ([..]) AS purchase_info
You Could Use
-- STEP 1: Create initial table
CREATE TABLE public.customers_infos AS
SELECT customers.id,
       customers.salary,
       0 AS weeks_since_last_visit,
       0 AS most_visited_category_id,
       0 AS highest_purchase_value
FROM company.customers;
-- STEP 2: Update traffic infos
UPDATE public.customers_infos
SET weeks_since_last_visit = DATE_DIFF(CURRENT_DATE, last_visit.date, WEEK)
FROM (
    SELECT customer_id,
           MAX(visit_date) AS date
    FROM web.traffic_info
    GROUP BY customer_id
) AS last_visit
WHERE last_visit.customer_id = customers_infos.id;
-- STEP 3: Update category infos
UPDATE public.customers_infos
SET most_visited_category_id = [...]
WHERE [...];
-- STEP 4: Update purchase infos
UPDATE public.customers_infos
SET highest_purchase_value = [...]
WHERE [...];
· Use meaningful names based on your own conventions
· Finally, write useful comments, but not too many
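As a small check on the JOIN syntax tip, the sketch below runs the comma-style join and the explicit JOIN against an in-memory SQLite database from Python and compares the results. The tables follow the post's examples but drop the company. schema prefix, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE transactions (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Asha', 25), (2, 'Ravi', 42), (3, 'Mina', 30);
    INSERT INTO transactions VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")

# Comma-style join: the join condition is mixed in with the row filter.
implicit_join = """
    SELECT customers.id,
           customers.name,
           COUNT(transactions.id) AS nb_transaction
    FROM customers, transactions
    WHERE customers.id = transactions.customer_id
      AND customers.age <= 30
    GROUP BY customers.id, customers.name
    ORDER BY customers.id
"""

# Explicit JOIN: the join condition lives in ON, the filter stays in WHERE.
explicit_join = """
    SELECT customers.id,
           customers.name,
           COUNT(transactions.id) AS nb_transaction
    FROM customers
    JOIN transactions ON customers.id = transactions.customer_id
    WHERE customers.age <= 30
    GROUP BY customers.id, customers.name
    ORDER BY customers.id
"""

implicit_rows = conn.execute(implicit_join).fetchall()
explicit_rows = conn.execute(explicit_join).fetchall()
print(implicit_rows == explicit_rows)
```

Both queries return the same rows here, so the explicit JOIN costs nothing in correctness while keeping the join condition separate from the filter.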
|
|
|
Utility tab |
Posted by: aishwarya.rajan@bdb.ai - 12-23-2022, 10:53 AM - Forum: DS- Lab Q&A
- No Replies
|
|
Q.) What is the use of the Utility tab in DS Lab?
A: This tab lets users create and list Python scripts (.py files) that can be imported into a notebook as needed.
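The general Python mechanism behind this can be sketched as follows: a helper script is saved as a .py file and then imported by name. The module name demo_text_utils, the function normalize_name, and the temporary directory standing in for wherever DS Lab keeps utility scripts are all assumptions, not DS Lab specifics.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical utility script; in DS Lab it would be created in the Utility tab.
script_source = '''
def normalize_name(name):
    """Trim whitespace and convert a name to lowercase snake case."""
    return "_".join(name.strip().lower().split())
'''

# Write the script to a temporary directory that stands in for script storage.
script_dir = tempfile.mkdtemp()
Path(script_dir, "demo_text_utils.py").write_text(script_source)

# Make the directory importable, then import the script like any module.
sys.path.insert(0, script_dir)
demo_text_utils = importlib.import_module("demo_text_utils")

result = demo_text_utils.normalize_name("  Web Visit ")
```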
|
|
|
Configure Email Component in Pipeline |
Posted by: ArghaPratim - 12-23-2022, 10:50 AM - Forum: BDB - Data Pipeline
- No Replies
|
|
The Email Component helps to configure the email content together with the email server and recipients.
i) Drag and drop the Email Component to the Pipeline Workflow Editor.
ii) Click the dragged Email Component to get the configuration fields.
iii) The Basic Information tab opens by default.
a. Invocation Type: Select the invocation type (Real-Time/Batch).
b. Deployment Type: It comes preselected based on the component.
c. Container Image Version: It comes preselected based on the component.
d. Failover Event: Select a failover Event from the drop-down menu.
e. Batch Size (min 10): Provide the maximum number of records to be processed in one execution cycle (the minimum limit for this field is 10).
iv) Open the Meta Information tab and provide the connection-specific details.
a. Subject (*): Provide the subject of the email.
b. Text Message: Provide the email content in the given space.
c. Attachment: Select either ‘Yes’ or ‘No’.
d. Receivers (*): Provide the receivers' email IDs separated by commas.
e. Host (*): Provide host details.
f. TLS: Enable or disable TLS by putting a checkmark in the given box.
g. Email Username: Provide the email username.
h. Encryption Type: Select an encryption type from the drop-down menu.
i. Email Password: Provide the email password.
j. Enable SSL: Enable SSL by putting a checkmark in the given box.
k. Email From: Provide the sender's email address.
l. Email Port: Provide the email port number.
m. Disable Email Sending: Put a tick mark to disable the sending of the notification emails.
n. Email Input: Provide the following details.
1) Path
2) Input Variable
3) Data Type
v) Click the ‘Save Component in Storage’ icon.
vi) A notification message appears to confirm the action.
vii) The Email Component is ready to be used in the Pipeline workflow. It can read data from an input event and pass the processed data to an output event, so connect the required event components accordingly to create a pipeline workflow.
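To see how the Meta Information fields relate to an ordinary email, here is a minimal sketch using Python's standard email and smtplib modules. It is not how the Email Component is implemented; the field values, the example.com addresses, and the send_email helper are hypothetical, and the helper is defined but never called, so nothing is sent.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical values for the Meta Information fields.
subject = "Pipeline run completed"                   # Subject (*)
text_message = "The nightly pipeline finished."      # Text Message
receivers = "ops@example.com, data@example.com"      # Receivers (*), comma-separated
email_from = "pipeline@example.com"                  # Email From

# Split the comma-separated receivers into a clean list of addresses.
recipient_list = [address.strip() for address in receivers.split(",")]

message = EmailMessage()
message["Subject"] = subject
message["From"] = email_from
message["To"] = ", ".join(recipient_list)
message.set_content(text_message)


def send_email(msg, host, port, username, password, use_tls=True):
    """Send msg through an SMTP server (hypothetical helper, not called here)."""
    with smtplib.SMTP(host, port) as server:
        if use_tls:
            server.starttls()  # upgrade the connection, as the TLS checkbox implies
        server.login(username, password)
        server.send_message(msg)
```

Host, Email Port, Email Username, Email Password, and TLS correspond to the arguments of the helper, while Subject, Receivers, Email From, and Text Message end up in the message itself.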
|
|
|
|