Create Transformer
Creating a transformer can be done simply by following these steps:
- Define default group keys for the input sf. To do this, you must set `DEFAULT_GROUP_KEYS`:

  ```python
  class CustomTransformer(Transformer):
      DEFAULT_GROUP_KEYS = {"default": "default", "address": "address"}
  ```

  The `DEFAULT_GROUP_KEYS` are used as the `group_keys` by default. If you want to use different names, pass `group_keys` to the constructor when you instantiate the transformer.
- If your transformation needs more than one raw input, and all of them must exist and cannot be `None`, set `ONLY_GROUP` to `True`:

  ```python
  class CustomTransformer(Transformer):
      DEFAULT_GROUP_KEYS = {"default": "default", "address": "address"}
      ONLY_GROUP = True
  ```

  The default value of `ONLY_GROUP` is `False`.
- Override the `validate` method. If you need to check that the input sf contains specific columns, you can use the `_validate_columns` method:

  ```python
  class CustomTransformer(Transformer):
      DEFAULT_GROUP_KEYS = {"default": "default", "address": "address"}
      ONLY_GROUP = True

      def validate(self, sf: SFrame):
          super().validate(sf)
          self._validate_columns(sf, self.default_sf_key, "column_1", "column_2")
  ```
- Set `HANDLER_NAME` to your preferred value. For example, you can choose `derive` for derivers and `trim` for trimmers; the examples here use `transform`:

  ```python
  class CustomTransformer(Transformer):
      DEFAULT_GROUP_KEYS = {"default": "default", "address": "address"}
      ONLY_GROUP = True
      HANDLER_NAME = "transform"

      def validate(self, sf: SFrame):
          super().validate(sf)
          self._validate_columns(sf, self.default_sf_key, "column_1", "column_2")
  ```
- Implement handler methods based on the input raw format. The method name must follow this rule: `HANDLER_NAME + "_" + FRAME_NAME`. For example, FRAME_NAME for pandas is `df` and for pyspark is `spf`:

  ```python
  class CustomTransformer(Transformer):
      DEFAULT_GROUP_KEYS = {"default": "default", "address": "address"}
      ONLY_GROUP = True
      HANDLER_NAME = "transform"

      def validate(self, sf: SFrame):
          super().validate(sf)
          self._validate_columns(sf, self.default_sf_key, "column_1", "column_2")

      def transform_df(self, default: pd.DataFrame, address: pd.DataFrame, *args, **kwargs): ...
  ```
- Return a dictionary from the handler method whose keys match the `group_keys` and whose values are the raw data:

  ```python
  class CustomTransformer(Transformer):
      DEFAULT_GROUP_KEYS = {"default": "default", "address": "address"}
      ONLY_GROUP = True
      HANDLER_NAME = "transform"

      def validate(self, sf: SFrame):
          super().validate(sf)
          self._validate_columns(sf, self.default_sf_key, "column_1", "column_2")

      def transform_df(self, default: pd.DataFrame, address: pd.DataFrame, *args, **kwargs):
          # your transformation implementation ...
          return {"default": default, "address": address}
  ```
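Taken together, the conventions above (`DEFAULT_GROUP_KEYS`, `ONLY_GROUP`, and the `HANDLER_NAME + "_" + FRAME_NAME` dispatch rule) can be sketched in plain Python. This is a simplified stand-in for illustration only: the real `Transformer` and `SFrame` classes come from the framework, the `handle` method here is hypothetical, and plain dicts stand in for pandas DataFrames:

```python
# Simplified stand-in illustrating the transformer conventions described
# above; not the framework's real Transformer class.
class Transformer:
    DEFAULT_GROUP_KEYS = {"default": "default"}
    ONLY_GROUP = False
    HANDLER_NAME = "transform"

    def handle(self, frame_name, raw):
        # When ONLY_GROUP is True, every group input must exist and not be None.
        if self.ONLY_GROUP and any(v is None for v in raw.values()):
            raise ValueError("all group inputs must exist and cannot be None")
        # Resolve the handler method by name: HANDLER_NAME + "_" + FRAME_NAME.
        handler = getattr(self, f"{self.HANDLER_NAME}_{frame_name}")
        return handler(**raw)


class CustomTransformer(Transformer):
    DEFAULT_GROUP_KEYS = {"default": "default", "address": "address"}
    ONLY_GROUP = True

    def transform_df(self, default, address, **kwargs):
        # The returned keys match the group_keys.
        return {"default": default, "address": address}


t = CustomTransformer()
result = t.handle("df", {"default": {"a": 1}, "address": {"b": 2}})
print(sorted(result))  # ['address', 'default']
```

Calling `handle("df", ...)` resolves to `transform_df` because `HANDLER_NAME` is `transform` and the pandas FRAME_NAME is `df`; passing a `None` group input with `ONLY_GROUP = True` raises instead of dispatching.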